ANALYTIC CONTINUATION OF DIAGONALS AND
HADAMARD COMPOSITIONS OF MULTIPLE 
POWER SERIES* 


BY 
R. H. CAMERON AND W. T. MARTIN 


1. Introduction. A problem that has always interested mathematicians is
the extension of an analytic function defined by a power series outside of its
region of convergence. The classical Hadamard theorem† furnishes one
method of doing this for functions of one variable. For this theorem gives
us information about the analytic continuation of $A(z) = \sum a_n z^n$ provided
we can find a way of factoring the coefficients $a_n$ into products $b_n c_n$ such that
$B(z) = \sum b_n z^n$ and $C(z) = \sum c_n z^n$ are functions whose analytic continuations are
known. Even if we are not able to factor the $a_n$ in such a way, we may be able
to find a known function $F(x, y) = \sum f_{mn} x^m y^n$ whose coefficient matrix $\{f_{mn}\}$
has $\{a_n\}$ as its diagonal ($f_{nn} = a_n$). One of the results of the present paper
furnishes information concerning the region of analyticity of $A(z)$ in terms of the
region of analyticity of $F(x, y)$. In case $F(x, y) = B(x)C(y)$ this method
reduces to the preceding.
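As a small illustration of the product case $F(x, y) = B(x)C(y)$, the following Python sketch builds a truncated coefficient matrix and reads off its diagonal. The function names, the truncation order, and the two geometric series are assumptions chosen only for this example; they do not come from the paper.

```python
from fractions import Fraction


def product_coefficients(b, c):
    """Coefficient matrix f[m][n] = b[m] * c[n] of F(x, y) = B(x) * C(y)."""
    return [[bm * cn for cn in c] for bm in b]


def diagonal(f):
    """Diagonal sequence a[n] = f[n][n] of a square coefficient matrix."""
    return [f[n][n] for n in range(len(f))]


N_TERMS = 8  # truncation order, chosen only for illustration

# B(z) = 1/(1 - z/2) has b_n = 2^(-n) and its only singularity at z = 2.
b = [Fraction(1, 2**n) for n in range(N_TERMS)]
# C(z) = 1/(1 - 3z) has c_n = 3^n and its only singularity at z = 1/3.
c = [Fraction(3**n) for n in range(N_TERMS)]

# Diagonal of F(x, y) = B(x) C(y): a_n = b_n c_n = (3/2)^n.
a = diagonal(product_coefficients(b, c))
```

Here the diagonal series is $A(z) = 1/(1 - 3z/2)$, whose singularity $z = 2/3$ is the product $2 \cdot \tfrac{1}{3}$ of the singularities of $B$ and $C$, as the Hadamard theorem leads one to expect.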

If we pass now to higher dimensions, our general point of view suggests a
number of different possibilities. For example, suppose that we wish to
investigate the analytic continuation of $A(x, y) = \sum a_{mn} x^m y^n$. If we can find a
known function $F(x_1, x_2, y_1, y_2) = \sum f_{m_1 m_2 n_1 n_2} x_1^{m_1} x_2^{m_2} y_1^{n_1} y_2^{n_2}$ for which
$a_{mn} = f_{mmnn}$, then simply by iterating our primary theorem we can draw
conclusions concerning the analytic extension of $A$ from our knowledge of the
analytic extension of $F$. Similarly we can obtain information concerning
$A(x, y, z) = \sum a_{mnp} x^m y^n z^p$ if we can find a known function $F(x_1, x_2, y_1, y_2, z_1, z_2)
= \sum f_{m_1 m_2 n_1 n_2 p_1 p_2} x_1^{m_1} x_2^{m_2} y_1^{n_1} y_2^{n_2} z_1^{p_1} z_2^{p_2}$ for which $a_{mnp} = f_{mmnnpp}$. In fact our
basic theorem can be iterated in such a way as to apply to any distribution
of dimensions.

Just as our first result contains as a special case the classical Hadamard
theorem, so these results contain an n-dimensional generalization of
Hadamard's theorem. The regions we consider in the present paper are star-shaped
and the nature of the singularities of the functions plays no role whatever.


* Presented to the Society, December 29, 1936; received by the editors May 26, 1937. 
† J. Hadamard, Théorème sur les séries entières, Acta Mathematica, vol. 22 (1898), pp. 55-64.

Haslam-Jones* has obtained a two-dimensional analogue of the Hadamard
theorem which furnishes information about a function $A(x, y)$ if there
are two known functions $B(x, y)$, $C(x, y)$ whose coefficients satisfy the
relations $b_{mn} c_{mn} = a_{mn}$ and whose singularities satisfy certain conditions. If we
compare his theorem with our two-dimensional theorem we see that neither
includes the other. For our result does not always furnish as extensive an
analytic continuation as that of Haslam-Jones, but it deals with a wider class
of functions.

Grouping the terms into homogeneous polynomials, Bochner and Martin†
have obtained an n-dimensional analogue of the Hadamard theorem. While
(for $n > 1$) their theorem does not deal directly with the individual coefficients,
it does have the advantage of invariance under any affine transformation.

2. Stars. An important class of regions with which we shall be concerned 
in this paper is the class of star-shaped regions. 


DEFINITION. A region $S$ in a space $R_n$ of $n$ complex variables $z_1, \dots, z_n$
is called a $(p_1, \dots, p_n)$-star if with the point $(z_1, \dots, z_n)$ it contains also all
the points $(\rho^{p_1} z_1, \dots, \rho^{p_n} z_n)$, where $0 \le \rho \le 1$ and the $p_j$ are positive integers.
In case $p_1 = p_2 = \dots = p_n = p$, the $(p_1, \dots, p_n)$-star is simply called a star.


It is clear that only the ratios of the $p$'s are significant. It is also obvious
that the union and intersection of two $(p_1, \dots, p_n)$-stars are also
$(p_1, \dots, p_n)$-stars.

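NOTATION. If $S$ is a $(p_1, \dots, p_n)$-star in $R_n$, the symbol $S_{(z_1 z_2)}$ will denote
the point set consisting of all points $P(w, z_3, z_4, \dots, z_n)$ in $R_{n-1}$ such that to
$P$ there corresponds a continuous positive periodic function $r = r(\theta)$ of period
$2\pi$ for which each of the points $Q(\theta)$: $(re^{i\theta}w, r^{-1}e^{-i\theta}, z_3, \dots, z_n)$ belongs to $S$,
$(0 \le \theta \le 2\pi)$.

We shall show that $S_{(z_1 z_2)}$ is a $(p_1 + p_2, p_3, \dots, p_n)$-star. Let $P(w, z_3, \dots, z_n)$
be a point of $S^* = S_{(z_1 z_2)}$ and let $r(\theta)$ be the associated function. Then since
$S$ is a $(p_1, \dots, p_n)$-star, each of the points

$$(\rho^{p_1} r e^{i\theta} w,\ \rho^{p_2} r^{-1} e^{-i\theta},\ \rho^{p_3} z_3, \dots, \rho^{p_n} z_n), \qquad (0 \le \rho \le 1,\ 0 \le \theta \le 2\pi),$$
belongs to $S$. Thus if $R_\rho(\theta) = r(\theta)\rho^{-p_2}$, all of the points
$$(R_\rho e^{i\theta} \rho^{p_1 + p_2} w,\ R_\rho^{-1} e^{-i\theta},\ \rho^{p_3} z_3, \dots, \rho^{p_n} z_n), \qquad (0 < \rho \le 1,\ 0 \le \theta \le 2\pi),$$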


* U. S. Haslam-Jones, An extension of Hadamard’s multiplication theorem, Proceedings of the 
London Mathematical Society, (2), vol. 27 (1928), pp. 223-232. 

† S. Bochner and W. T. Martin, Singularities of composite functions in several variables, Annals of
Mathematics, (2), vol. 38 (1937), pp. 293-302.

‡ B. Almer, Sur quelques problèmes de la théorie des fonctions analytiques de deux variables
complexes, Arkiv för Matematik, Astronomi och Fysik, vol. 17 (1922), pp. 1-70; W. T. Martin, Special
regions of regularity of functions of several complex variables, Annals of Mathematics, (2), vol. 38 (1937),
pp. 602-625.

are in $S$, and hence all of the points
$$(\rho^{p_1 + p_2} w,\ \rho^{p_3} z_3, \dots, \rho^{p_n} z_n), \qquad (0 < \rho \le 1),$$
are in $S^*$.

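It only remains to show that $S^*$ is an open set, as connectedness obviously
follows from the last statement. (It is trivial to show that $(w = 0, z_3 = 0, \dots,
z_n = 0)$ belongs to $S^*$ since $S$ contains the origin.) To show that $S^*$ is open,
consider a point $P_0(w^{(0)}, z_3^{(0)}, \dots, z_n^{(0)})$ of it. Since $S$ is open, with
$Q_0(\theta)$: $(re^{i\theta}w^{(0)}, r^{-1}e^{-i\theta}, z_3^{(0)}, \dots, z_n^{(0)})$
there is contained an n-dimensional neighborhood
$N(\theta)$ of $Q_0(\theta)$ in $S$,

$$N(\theta):\ |z_1 - re^{i\theta}w^{(0)}| < \epsilon(\theta),\ |z_2 - r^{-1}e^{-i\theta}| < \epsilon(\theta),\ |z_j - z_j^{(0)}| < \epsilon(\theta), \qquad (j = 3, \dots, n).$$

Because of the continuity of $r(\theta)$ it is clear that there is a positive $\epsilon$ independent
of $\theta$ such that for all $\theta$ in $0 \le \theta \le 2\pi$ all of the points for which $|z_1 - re^{i\theta}w^{(0)}| < \epsilon$,
$|z_2 - r^{-1}e^{-i\theta}| < \epsilon$, $|z_j - z_j^{(0)}| < \epsilon$, $(j = 3, \dots, n)$, belong to $S$. Let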


$$\delta = \min\left(\epsilon,\ \frac{\epsilon}{\max_{0 \le \theta \le 2\pi} r(\theta)}\right).$$
Thus when

$$|w - w^{(0)}| < \delta,\ |z_3 - z_3^{(0)}| < \delta, \dots, |z_n - z_n^{(0)}| < \delta,$$

the point $(re^{i\theta}w, r^{-1}e^{-i\theta}, z_3, \dots, z_n)$ is in the $\epsilon$-neighborhood of $Q_0(\theta)$ and
hence in $S$. Thus $S^*$ is a $(p_1 + p_2, p_3, \dots, p_n)$-star. We shall call $S^* = S_{(z_1 z_2)}$ a
contracted star of $S$. More general contractions will later be given.


DEFINITION. Let $f(z_1, \dots, z_n)$ be any function analytic in a neighborhood
of the origin. Let us define a point set $S = S(f; p_1, \dots, p_n)$ as follows: A point
$(z_1, \dots, z_n)$ belongs to $S$ if $f$ is regular at all of the points $(\rho^{p_1} z_1, \dots, \rho^{p_n} z_n)$,
$0 \le \rho \le 1$. It is clear that $S$ is a $(p_1, \dots, p_n)$-star; we shall call $S$ the
$(p_1, \dots, p_n)$-star of $f$. We shall denote by $S(f)$ the $(1, \dots, 1)$-star of $f$.


3. Proof of the central theorem. We shall now prove the following exten- 
sion of the Hadamard theorem on which our remaining results will be based: 


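THEOREM 1. Let
$$(1)\qquad A(z_1, \dots, z_n) = \sum a_{m_1 \cdots m_n} z_1^{m_1} \cdots z_n^{m_n}$$
be analytic in a neighborhood of the origin, and let $S$ be its $(p_1, \dots, p_n)$-star.
Then the diagonal function

$$A_{(z_1 z_2)}(w, z_3, \dots, z_n) = \sum a_{m m m_3 \cdots m_n} w^m z_3^{m_3} \cdots z_n^{m_n}$$

is analytic in the contracted star $S^* = S_{(z_1 z_2)}$.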
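To make the notion of a diagonal function concrete for a function that is not a product, here is a short Python sketch. The choice $F(x, y) = 1/(1 - x - y)$ and the helper names are assumptions made for illustration only. The coefficient of $x^m y^n$ in this $F$ is the binomial coefficient $\binom{m+n}{m}$, so its diagonal is $\binom{2n}{n}$, which is the Taylor sequence of $(1 - 4w)^{-1/2}$.

```python
from fractions import Fraction
from math import comb


def diagonal_of_reciprocal(n_terms):
    """Diagonal coefficients f[n][n] = C(2n, n) of F(x, y) = 1/(1 - x - y)."""
    return [comb(2 * n, n) for n in range(n_terms)]


def inverse_sqrt_series(n_terms):
    """Taylor coefficients of (1 - 4w)^(-1/2) from the binomial series."""
    coeffs = []
    term = Fraction(1)  # generalized binomial C(-1/2, n) times (-4)^n, at n = 0
    for n in range(n_terms):
        coeffs.append(term)
        # Step from n to n + 1: multiply by (-1/2 - n) / (n + 1) and by -4.
        term = term * (Fraction(-1, 2) - n) / (n + 1) * (-4)
    return coeffs


diag = diagonal_of_reciprocal(10)
series = inverse_sqrt_series(10)
```

The two lists agree term by term, so in this example the diagonal function is $(1 - 4w)^{-1/2}$, which is analytic well beyond the region where the double series for $F$ converges.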


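For the proof let $P_0(w^{(0)}, z_3^{(0)}, \dots, z_n^{(0)})$ be any point in $S^*$. Then by
definition of $S^*$ there exists a continuous positive periodic function $r(\theta)
= r(\theta; P_0)$ such that all of the points

$$(re^{i\theta}w^{(0)},\ r^{-1}e^{-i\theta},\ z_3^{(0)}, \dots, z_n^{(0)})$$

are in $S$. Since $S$ is open it is clear that for sufficiently small positive $\delta$ all the
points $(r^*e^{i\theta}w^{(0)}, (r^*)^{-1}e^{-i\theta}, z_3^{(0)}, \dots, z_n^{(0)})$ are in $S$, where

$$r^*(\theta) = \frac{1}{\delta}\int_\theta^{\theta + \delta} r(\phi)\,d\phi.$$

Then the set of points $t = r^*(\theta)e^{i\theta}$ forms a rectifiable closed curve $C^* = C^*(P_0)$
around the origin. For $t$ on $C^*$ and $(w, z_3, \dots, z_n)$ in a sufficiently small
neighborhood $N = N_\epsilon(P_0)$ of $P_0$,

$$N_\epsilon(P_0):\ |w - w^{(0)}| < \epsilon,\ |z_3 - z_3^{(0)}| < \epsilon, \dots, |z_n - z_n^{(0)}| < \epsilon,$$

the function $A(wt, t^{-1}, z_3, \dots, z_n)$ is analytic in all its variables. Then since
$S$ is a $(p_1, \dots, p_n)$-star, it follows that $A(u^{p_1}wt, u^{p_2}t^{-1}, u^{p_3}z_3, \dots, u^{p_n}z_n)$ is
analytic in $w, z_3, \dots, z_n$, for $P(w, z_3, \dots, z_n) \in N(P_0)$, $t \in C^*$, $0 \le u \le 1$. Thus

$$g(w, z_3, \dots, z_n, u; P_0) = \frac{1}{2\pi i}\int_{C^*} A(u^{p_1}wt, u^{p_2}t^{-1}, u^{p_3}z_3, \dots, u^{p_n}z_n)\,\frac{dt}{t}$$

is analytic in $w, z_3, \dots, z_n, u$ when $P \in N$ and $0 \le u \le 1$. Now let $R$ be positive
and such that the series (1) converges whenever $|z_j| < R$, $(j = 1, \dots, n)$, and
let $\gamma$ and $\Gamma$ be the minimum and maximum values of $|t|$ for $t$ on $C^*$. Let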


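$$\delta_1 = \min\left\{ \left[\frac{R}{\Gamma(|w^{(0)}| + \epsilon)}\right]^{1/p_1},\ (R\gamma)^{1/p_2},\ \left[\frac{R}{|z_j^{(0)}| + \epsilon}\right]^{1/p_j} \right\}$$

for $j = 3, \dots, n$. Then if $0 \le u < \delta_1$, and $P \in N$,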


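$$
\begin{aligned}
g(w, z_3, \dots, z_n, u; P_0)
&= \frac{1}{2\pi i}\int_{C^*} \sum a_{m_1 \cdots m_n} (u^{p_1}wt)^{m_1} (u^{p_2}t^{-1})^{m_2} (u^{p_3}z_3)^{m_3} \cdots (u^{p_n}z_n)^{m_n}\,\frac{dt}{t} \\
&= A_{(z_1 z_2)}(u^{p_1 + p_2}w,\ u^{p_3}z_3, \dots, u^{p_n}z_n) \\
&= \sum_{k=0}^{\infty}\ \sum_{(p_1 + p_2)m + p_3 m_3 + \cdots + p_n m_n = k} a_{m m m_3 \cdots m_n} w^m z_3^{m_3} \cdots z_n^{m_n}\, u^k \\
&= \sum_{k=0}^{\infty} B_k(P)\,u^k. \qquad (4)
\end{aligned}
$$

Since for a fixed $P \in N$ this is an analytic function of $u$ for $0 \le u \le 1$, its value
at $u = 1$ is given by the Le Roy† sum of the last series. Thus
$$(5)\qquad g(P, 1; P_0) = g(w, z_3, \dots, z_n, 1; P_0) = \lim_{s \to 1-} \sum_{k=0}^{\infty} \frac{\Gamma(ks + 1)}{\Gamma(k + 1)} B_k(P).$$
Since $g(P, u; P_0)$ is analytic for $P \in N$, $0 \le u \le 1$, $g(P, 1; P_0)$ is certainly analytic
for $P \in N$. But the summation in (5) is independent of $P_0$, and in particular
it is defined when $P = P_0$, while $P_0$ may be any point in $S^*$. Thus
$$A(P) = \lim_{s \to 1-} \sum_{k=0}^{\infty} \frac{\Gamma(ks + 1)}{\Gamma(k + 1)} B_k(P)$$
is defined throughout $S^*$; and since $g(P, 1; P_0) = A(P)$ throughout some
$N_\epsilon(P_0)$, it follows that $A(P)$ is regular at all points of $S^*$. Since the Le Roy
method of summation is regular, it follows that

$$A(P) = \sum_{k=0}^{\infty} B_k(P) = A_{(z_1 z_2)}(P)$$
for $P$ sufficiently close to the origin. Thus our theorem is proved.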

4. Scope of Theorem 1. As an indication of the scope of Theorem 1, we
shall show that it contains the classical Hadamard theorem as a special case.
Indeed, if $A(z_1, z_2) = B(z_1)C(z_2)$, where $B$ and $C$ are any two functions analytic
in the neighborhood of the origin, then the diagonal function $A_{(z_1 z_2)}$ is the
Hadamard composite of $B$ and $C$. Thus if $\xi_\theta e^{i\theta}$ and $\eta_\theta e^{i\theta}$ are the star vertices
of $B$ and $C$, it is only necessary to show that whenever
$$(6)\qquad 0 \le c < \operatorname{g.l.b.}_{0 \le \phi \le 2\pi}\ \xi_\phi\,\eta_{\psi - \phi}$$
then the point $ce^{i\psi}$ belongs to $S_{(z_1 z_2)}$, and hence by Theorem 1 is a regular
point of $A_{(z_1 z_2)}$. We shall show this by proving the existence of a continuous
positive periodic function $r(\theta)$ such that for all $\theta$

$$(7)\qquad cr(\theta) < \xi_{\psi + \theta}, \qquad \frac{1}{r(\theta)} < \eta_{-\theta}.$$
In the first place it follows from (6) that the set $\mathcal{B}$ of points $w = te^{i\theta}$ defined by
the inequalities

$$(8)\qquad \frac{1}{\eta_{-\theta}} < t < \frac{\xi_{\psi + \theta}}{c}$$


† See, for example, L. L. Smail, Theory of Summable Infinite Processes, Oregon, 1925, pp. 13, 14.

contains points on each ray $\arg w = \theta$. Also due to the continuity of the
functions $z^{-1}$ and $c^{-1}z$ and the fact that $S(B)$ and $S(C)$ are open regions, the set $\mathcal{B}$
is an open region. It is obvious that we can find a $\delta$-neighborhood of any $\theta$
throughout which there exists a positive continuous function $r_1(\theta)$ which
satisfies (7). Thus by the Heine-Borel theorem there is a positive periodic
function $r_2(\theta)$ which has at most a finite number of finite jumps in each period
interval and which always satisfies (7). Due to (8) we can join the ends of
the jumps and obtain a multiple-valued function satisfying (7); and since $S$
is open we can modify this function so as to make it single-valued. Thus $r(\theta)$
exists and $ce^{i\psi}$ belongs to $S_{(z_1 z_2)}$.

5. Generalizations. Since the contraction of a star is a star, the process of 
contraction may be iterated. In order to have the resulting stars as extensive 
as possible, and independent of the order, we make the following definition: 


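DEFINITION. If $S$ is a $(p_1, \dots, p_n)$-star, and if
$$T = S_{(z_1 z_2)}, \qquad U = S_{(z_3 z_4)},$$
then we define

$$S_{(z_1 z_2)(z_3 z_4)} = T_{(z_3 z_4)} + U_{(z_1 z_2)},$$

where the sum is a union in the sense of point sets. It is clear that it is a
$(p_1 + p_2, p_3 + p_4, p_5, \dots, p_n)$-star. Similarly we define


$$S_{(z_1 z_2)(z_3 z_4)(z_5 z_6)} = [S_{(z_1 z_2)(z_3 z_4)}]_{(z_5 z_6)} + [S_{(z_1 z_2)(z_5 z_6)}]_{(z_3 z_4)} + [S_{(z_3 z_4)(z_5 z_6)}]_{(z_1 z_2)}$$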


and 
where 
T = S 225); U = V = 
Contractions of higher orders are defined by the obvious extension of these 
definitions. 


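THEOREM 2. Let (1) be analytic in a neighborhood of the origin and let
$S = S(A; p_1, \dots, p_n)$ be its $(p_1, \dots, p_n)$-star. Then the diagonal functions

$$(9)\qquad A_{(z_1 z_2)(z_3 z_4)}(z_{12}, z_{34}, z_5, \dots, z_n) = \sum a_{m m \mu \mu m_5 \cdots m_n} z_{12}^{m} z_{34}^{\mu} z_5^{m_5} \cdots z_n^{m_n}$$

and

$$(10)\qquad A_{(z_1 z_2 z_3)}(z_{123}, z_4, \dots, z_n) = \sum a_{m m m m_4 \cdots m_n} z_{123}^{m} z_4^{m_4} \cdots z_n^{m_n}$$

are analytic in the stars $S_{(z_1 z_2)(z_3 z_4)}$, $S_{(z_1 z_2 z_3)}$ respectively.

For the proof we note that

$$A_{(z_1 z_2)(z_3 z_4)} = [A_{(z_1 z_2)}]_{(z_3 z_4)} = [A_{(z_3 z_4)}]_{(z_1 z_2)},$$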
and 


The theorem follows by iteration of Theorem 1. 

6. Two-dimensional Hadamard theorem. We can now obtain as a special 
case of Theorem 2 a two-variable extension of the Hadamard theorem. Let 
us take a specialized function of four variables; namely, a product of two 
functions each of two variables, 


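$$(11)\qquad A(s, u, t, v) = B(s, t)\,C(u, v),$$

$$(12)\qquad B(x, y) = \sum b_{mn} x^m y^n,$$

$$(13)\qquad C(x, y) = \sum c_{mn} x^m y^n.$$

Then our previously defined diagonal function (see (9)) becomes
$$(14)\qquad A_{(su)(tv)}(x, y) = \sum b_{mn} c_{mn} x^m y^n,$$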

and our special theorem yields the following theorem: 


THEOREM 3. Let (12) and (13) be any two functions analytic in a neighborhood
of the origin, and let $S(B)$ and $S(C)$ be their stars. Then the Hadamard
composition

$$H(x, y) = \sum b_{mn} c_{mn} x^m y^n$$

is analytic in the star $S^* = S_1 + S_2$, where $S_1$ and $S_2$ are defined in terms of $S(B)$
and $S(C)$ as follows. A point $(x, y)$ is in the star $S_1$ if for all $\phi$, $\theta$ the points

$$(x\,r_\phi(\theta)e^{i\theta},\ y\,r'(\phi)e^{i\phi})$$

are in $S(B)$ and the points

$$(r_\phi^{-1}(\theta)e^{-i\theta},\ r'^{-1}(\phi)e^{-i\phi})$$

are in $S(C)$, where $r'(\phi)$ is a positive continuous periodic function of $\phi$, and
$r_\phi(\theta)$ is a positive periodic continuous function of $\theta$ and depends also on the
parameter $\phi$. The star $S_2$ is similarly defined, with the roles of $r$ and $r'$
interchanged.


From the statement of Theorem 3 and the manner in which it follows from 
our general theorem, one could easily obtain an n-variable extension of the 
Hadamard theorem. 
EQUIVALENCE OF PAIRS OF MATRICES*


BY 
MERRILL M. FLOOD 


1. Introduction. Two pairs of matrices, $[A_1, A_2]$ and $[B_1, B_2]$, with
elements in a commutative field $F$, are said to be equivalent† if and only if there
exist two non-singular matrices $P$ and $Q$, with elements in $F$, such that
$A_1 = PB_1Q$ and $A_2 = PB_2Q$.
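The definition can be exercised on a small example. The Python sketch below is an illustration under stated assumptions: the matrices `B1`, `B2`, `P`, `Q`, the sample points, and the helper names are all hypothetical choices, and exact rational arithmetic stands in for a general field $F$. It forms $A_1 = PB_1Q$, $A_2 = PB_2Q$ and checks that the rank of $A_1 s + A_2 t$ agrees with that of $B_1 s + B_2 t$ at several values of $(s, t)$.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def rank(rows):
    """Rank of a rational matrix by Gauss-Jordan elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def pencil_value(m1, m2, s, t):
    """The matrix m1 * s + m2 * t for scalar values s, t."""
    return [[s * u + t * v for u, v in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


# Hypothetical example data: a 2 x 3 pair and two non-singular multipliers.
B1 = [[1, 0, 0], [0, 1, 0]]
B2 = [[0, 1, 0], [0, 0, 1]]
P = [[1, 2], [3, 5]]                    # determinant -1
Q = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # determinant 1

A1 = matmul(matmul(P, B1), Q)
A2 = matmul(matmul(P, B2), Q)

samples = [(1, 0), (0, 1), (1, 1), (2, -3), (5, 7)]
ranks_a = [rank(pencil_value(A1, A2, s, t)) for s, t in samples]
ranks_b = [rank(pencil_value(B1, B2, s, t)) for s, t in samples]
```

Equal ranks at sample points are only a necessary condition for equivalence; the invariants developed in the paper are what settle the question in general.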

The totality of pairs of matrices may be separated into different classes 
in such a way that all pairs in one class are equivalent to one another while 
pairs in different classes are not equivalent. The problem which arises natu- 
rally is to determine a set of invariants which will characterize the pairs in 
each class and to select from each class a canonical pair defined uniquely in 
terms of these invariants. A rational solution of the problem is one which is 
carried out completely in the field F; the invariants and canonical pairs 
obtained in such a solution will be rational. 

The rank of a pair $[A_1, A_2]$ is the maximum rank of the matrices of the
matric pencil $A = A_1x_1 + A_2x_2$, where $x_1$ and $x_2$ are indeterminates in $F$. A
matric pencil is said to be non-singular if it is square and of rank equal to its
order (otherwise it is called singular), and it is said to be regular if the rank
of either one of its coefficients is the same as the rank of the pencil.

Non-singular matric pencils were first classified by Weierstrass‡ who
constructed an irrational canonical form defined by means of the elementary
divisors of the pencil. Frobenius§ later gave a rational treatment of the
non-singular case. Kronecker|| treated the singular case and gave an irrational
canonical form. Muth¶ gave a full account of the theory of pairs of bilinear
forms as it stood at the turn of the century. De Séguier** seems to have been
the first to give a rational treatment of the singular case. More recently, it has
received the attention of Dickson,†† Turnbull and Aitken,‡‡ Wedderburn,§§


* Presented to the Society, March 27, 1937; received by the editors May 25, 1937. 

† C. C. MacDuffee, The Theory of Matrices, Berlin, 1933, p. 48.

‡ K. Weierstrass, Monatsberichte, Preussische Akademie der Wissenschaften, 1868, pp. 310-338.

§ G. Frobenius, Sitzungsberichte, Preussische Akademie der Wissenschaften, 1894, pp. 31-44.

|| L. Kronecker, Sitzungsberichte, Preussische Akademie der Wissenschaften, 1890, pp. 1225-
1237.

¶ P. Muth, Theorie und Anwendung der Elementartheiler, Leipzig, 1899.

** J. A. de Séguier, Bulletin de la Société Mathématique de France, vol. 36 (1908), pp. 20-40.

†† L. E. Dickson, these Transactions, vol. 29 (1927), pp. 239-253.

‡‡ Turnbull and Aitken, Canonical Matrices, Glasgow, 1932, chap. 9.

§§ J. H. M. Wedderburn, Lectures on Matrices, American Mathematical Society Colloquium
Publications, vol. 17, 1934, chap. 4.
Turnbull,* Ledermann,† Williamson,‡ and others.

In this paper the problem of constructing a rational canonical form in the
singular case is reduced to the consideration of the non-singular case. The
proofs are completely rational, quite elementary, and relatively short. The
canonical form which is obtained is defined essentially in terms of the set of
invariants shown by Williamson‡ to characterize the classes of equivalent
matrices. The method of proof is very similar to that used by Ingraham§ in
his treatment of the equivalence of singular pencils of Hermitian matrices.

2. Preliminary remarks. Consider a singular pencil $A = A_1x_1 + A_2x_2$ of
rank $\rho(A) = r$ and order $[\theta, \theta']$. Set $R_1 = A_1t_{11} + A_2t_{21}$ and $R_2 = A_1t_{12} + A_2t_{22}$,
where the $t_{ij}$ are quantities of $F$ such that $|t_{ij}| \ne 0$. If $R = R_1x_1 + R_2x_2$ and
$T = \|t_{ij}\|$, then the relations above may be written $R = AT$, and the pencils
$A$ and $R$ are said to be transformable. If $B$ is a second matric pencil, it follows
easily that $A$ is equivalent to $B$ ($A \sim B$) if and only if $AT \sim BT$. In particular,
there exist two quantities $t_{11}$ and $t_{21}$ of $F$ not both zero and such that
$\rho(A_1t_{11} + A_2t_{21}) = r$, and in this case the pencil $R = AT$ is said to be regular.

If it is desired only to obtain necessary and sufficient conditions for the 
equivalence of two pencils, then there is no loss of generality in considering 
only regular pencils. However, if a canonical form in the most strict sense is 
required, it is necessary to start with the original pencils rather than their 
regular transforms, as has been pointed out by Ledermann.|| Canonical forms 
will be constructed only for regular pencils, but the invariants used will be 
shown to afford a satisfactory classification for all pencils. It is felt that this 
solves the important part of the problem. 

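3. Rational canonical form for regular matric pencils. Constant
non-singular matrices $P$ and $Q$ exist¶ such that

$$PR_1Q = e = \begin{Vmatrix} 1_r & 0 \\ 0 & 0 \end{Vmatrix};$$

hence $R \sim ex_1 + PR_2Qx_2 = ex_1 + ax_2 = R_0$. If we set

$$R_0 = \begin{Vmatrix} 1_r & 0 \\ 0 & 0 \end{Vmatrix} x_1 + \begin{Vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{Vmatrix} x_2,$$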


*H. W. Turnbull, Proceedings of the Edinburgh Mathematical Society, (2), vol. 4 (1935) 
pp. 67-76. 

† W. Ledermann, ibid., (2), vol. 4 (1935), pp. 92-105.

‡ J. Williamson, ibid., (2), vol. 4 (1936), pp. 224-231.

§ Ingraham and Wegner, these Transactions, vol. 38 (1935), pp. 145-162.

|| Loc. cit.

¶ MacDuffee, loc. cit., p. 43.

it follows immediately that $a_{22} = 0$, for otherwise the rank of $R_0$ would be
greater than $r$, which is impossible since $R_0$ has been assumed to be regular.

Since $1_r x_1 + a_{11} x_2$ is non-singular, the rows of $(a_{21} \;\; 0)$ must be linearly
dependent on the rows of $(1_r x + a_{11} \;\; a_{12})$; thus there exists a matrix $X_{21}$ such that
$X_{21}(1_r x + a_{11}) = a_{21}$ and $X_{21}a_{12} = 0$. Necessarily then, $X_{21} = a_{21}(1_r x + a_{11})^{-1}$ and
$a_{21}(1_r x + a_{11})^{-1}a_{12} = 0$. For $x$ sufficiently large

$$(1_r x + a_{11})^{-1} = \frac{1}{x}\sum_{k=0}^{\infty}\left(-\frac{a_{11}}{x}\right)^k;$$
hence $a_{21}a_{11}^k a_{12} = 0$ for $k = 0, 1, 2, \dots$.

Conversely, if $a_{21}a_{11}^k a_{12} = 0$ for $k = 0, 1, 2, \dots$, then $a_{21}(1_r x + a_{11})^{-1}a_{12} = 0$.
Hence, if $X_{21} = a_{21}(1_r x + a_{11})^{-1}$, it follows that $X_{21}(1_r x + a_{11}) = a_{21}$ and $X_{21}a_{12} = 0$,
so that the last $\theta - r$ rows of $ex + a$ are dependent on the first $r$ rows. This
proves the following lemma:


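LEMMA A. The rank of the matric pencil

$$\begin{Vmatrix} 1_r & 0 \\ 0 & 0 \end{Vmatrix} x + \begin{Vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{Vmatrix}$$

is $r$ if and only if $a_{22} = 0$ and $a_{21}a_{11}^k a_{12} = 0$ for $k = 0, 1, 2, \dots$.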
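Lemma A can be checked numerically on small cases. The following Python sketch is illustrative only: the helper names and the two example triples $(a_{11}, a_{12}, a_{21})$ are hypothetical, $a_{22}$ is taken to be zero throughout, and the rank over the field of rational functions in $x$ is estimated as the maximum rank over a handful of rational sample values of $x$ (valid when there are more samples than the order of the matrix). By the Cayley-Hamilton theorem it suffices to test $k = 0, \dots, r - 1$.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def rank(rows):
    """Rank of a rational matrix by Gauss-Jordan elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def lemma_a_condition(a11, a12, a21):
    """True when a21 * a11^k * a12 = 0 for k = 0, ..., r - 1."""
    left = a21
    for _ in range(len(a11)):
        if any(v != 0 for row in matmul(left, a12) for v in row):
            return False
        left = matmul(left, a11)
    return True


def generic_rank(a11, a12, a21, samples=range(1, 8)):
    """Max over sample x of the rank of [[x * 1_r + a11, a12], [a21, 0]]."""
    r = len(a11)
    best = 0
    for x in samples:
        top = [[a11[i][j] + (x if i == j else 0) for j in range(r)] + list(a12[i])
               for i in range(r)]
        bottom = [list(row) + [0] * len(a12[0]) for row in a21]
        best = max(best, rank(top + bottom))
    return best


a11 = [[0, 1], [0, 0]]
a12 = [[1], [0]]
good_a21 = [[0, 1]]   # a21 * a11^k * a12 = 0 for every k
bad_a21 = [[1, 0]]    # a21 * a12 = 1, so the condition fails at k = 0

good = (lemma_a_condition(a11, a12, good_a21), generic_rank(a11, a12, good_a21))
bad = (lemma_a_condition(a11, a12, bad_a21), generic_rank(a11, a12, bad_a21))
```

In the first case the condition holds and the rank stays at $r = 2$; in the second the condition fails and the rank rises to 3, in agreement with the lemma.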


Lemma A holds true only if the coefficient field $F$ is commutative and
has characteristic zero. For a field of characteristic $p \ne 0$ it is necessary to
alter the treatment slightly. The present author has shown (Annals of 
Mathematics, (2), vol. 36 (1935), p. 865) that the matric pencil of Lemma A 
is equivalent to a pencil 


0 0 bi O dy 
0 1, O || x, 
0 O O 


where $s = \rho(a_{22})$ and $b_{31}b_{11}^k b_{13} = 0$ for $k = 0, 1, 2, \dots$. It follows easily that
there is no loss of generality in considering pencils satisfying the conditions 
of Lemma A; and in this way the proofs given in this paper may be extended 
to include pencils with coefficients in an arbitrary field. This more general 
method of proof has been used by the present author in a recent paper 
(Strict equivalence of matric pencils, presented to the Society December 29, 
1937, but not yet published) treating the problem of equivalence of matric 
pencils, singular and non-singular. 

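We now proceed with the construction of a canonical form for the regular
pencil $S = ex + a$. If $\alpha$ and $\beta$ are non-singular matrices of orders $\theta$ and $\theta'$ with
elements in $F$, then $\alpha e \beta = e$ if

$$\alpha = \begin{Vmatrix} \alpha_{11} & \alpha_{12} \\ 0 & \alpha_{22} \end{Vmatrix}, \qquad \beta = \begin{Vmatrix} \alpha_{11}^{-1} & 0 \\ \beta_{21} & \beta_{22} \end{Vmatrix},$$

where $\alpha_{22}$ and $\beta_{22}$ are non-singular, and so

$$\alpha a \beta = \begin{Vmatrix} \alpha_{11}a_{11}\alpha_{11}^{-1} + \alpha_{12}a_{21}\alpha_{11}^{-1} + \alpha_{11}a_{12}\beta_{21} & \alpha_{11}a_{12}\beta_{22} \\ \alpha_{22}a_{21}\alpha_{11}^{-1} & 0 \end{Vmatrix}.$$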


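Now if the rank of $a_{21}$ is $r_1$, $\alpha_{22}$ and $\alpha_{11}$ may be chosen so that

$$\alpha_{22}a_{21}\alpha_{11}^{-1} = \begin{Vmatrix} 1_{r_1} & 0 \\ 0 & 0 \end{Vmatrix},$$

and it follows from Lemma A that the first $r_1$ rows of $\alpha_{11}a_{12}\beta_{22}$ must be zero.
Furthermore, if the rank of $a_{12}$ is $c_1$, it is clear that $\beta_{22}$ may be chosen so that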
0 0 


2114, 2822 = 


0 
With this choice of a;; and 6, the pencil T=aS6 takes the form 


1,0 00 0 0 a, 
01,0 0 0 X2 de a3 
0 Xs 
0 


1,,0 0 


0 
0 000 
0-, 00 0 


0 
where h=r—n—G, =9—r—n—a, and k=0’ —r—r,—c. Finally ay and Ba 
may clearly be chosen so that the x; are all zero; then 

0 x + a2 as 

0 0 « 

1 0 0 

0 0 

0 0 


The rank of $T$ is $r$, since $\alpha$ and $\beta$ were chosen non-singular, therefore
$$\begin{Vmatrix} 1_h & 0 \\ 0 & 0 \end{Vmatrix} x + \begin{Vmatrix} a_2 & a_3 \\ a_1 & a_4 \end{Vmatrix}$$
is regular and of rank $r - r_1 - c_1$. Hence, by Lemma A, $a_4 = 0$ and $a_1 a_2^k a_3 = 0$ for
$k = 0, 1, 2, \dots$.

Consider a second regular pencil $U = U_1x_1 + U_2x_2$ which is equivalent to $R$.
The rank of $U_1$ is necessarily $r$, therefore, constant non-singular matrices $P_0$
and $Q_0$ exist such that $P_0U_1Q_0 = e$. Then if

$$V = P_0UQ_0 = ex + b = \begin{Vmatrix} 1_r & 0 \\ 0 & 0 \end{Vmatrix} x + \begin{Vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{Vmatrix},$$

it follows from Lemma A that $b_{22} = 0$. Since $V \sim S$ there exist constant
non-singular matrices $x$ and $y$ such that $xV = Sy$, which equation is equivalent to
the relations $xe = ey$ and $xb = ay$. From $xe = ey$ it follows that $x_{11} = y_{11}$, $x_{21} = 0$,
$y_{12} = 0$, and that $x_{11}$, $x_{22}$ and $y_{22}$ are non-singular. Then from $xb = ay$ it follows
that $x_{22}b_{21} = a_{21}y_{11}$ and $x_{11}b_{12} = a_{12}y_{22}$. Since $x_{11}$, $x_{22}$, $y_{11}$ and $y_{22}$ are necessarily
non-singular this shows that the ranks of $b_{21}$ and $b_{12}$ are the same as the ranks $r_1$
and $c_1$ of $a_{21}$ and $a_{12}$. $r_1$ will be called the first "row invariant subrank" and $c_1$
the first "column invariant subrank" of $R$ or of any pencil equivalent to $R$.
It follows that constant non-singular matrices $\alpha_0$ and $\beta_0$ can be chosen so that
$W = \alpha_0 V \beta_0$ takes a form analogous to that of $T$ but with $a_i$ replaced by $b_i$.

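The pencils $U$ and $R$ are equivalent if and only if there exist constant
non-singular matrices $p$ and $q$ such that $pT = Wq$. From $pe = eq$ it is clear that
$p$ and $q$ must be of the forms

$$p = \begin{Vmatrix} z & p_{12} \\ 0 & p_{22} \end{Vmatrix} \quad \text{and} \quad q = \begin{Vmatrix} z & 0 \\ q_{21} & q_{22} \end{Vmatrix},$$

hence $pT = Wq$ may be replaced by the equation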


Pu Pie pis Pris Pre 0 00 0 0 tu Px 9 0 O 
bu pur prs pu prs prs bu pu ps 09 O 
Psi pss Pu Pas Pre pu pss 09 0 
0 O pu pas das Gar Gar Jas Jaa Yas 
0 O fis pss Ps Gs1 Yss 
0 O pes Pes Pes 0 Je Jos Joa Jes Yee 


This equation is equivalent to the set of relations: 
Psa = Pos = = fis = Gea = Gos = Pos = O, 
Pu = pas, $33 = dee, = ei, 
Pu = dipa, p32d3 = Pi2ds = Dipes, 
pos = bepar + dspai, Pirdi + pisde = 
+ ps2d2 = p22d3 = bepes + 
poids + poede = + 
It follows easily that these equations have a solution such that $p$ is
non-singular if and only if there exist constant non-singular matrices $p_{11}$, $p_{22}$, and
$p_{33}$ which satisfy the relations:
Piid1 = = dbspss, 
p21d1 + pode = bepor + 
These equations may be rewritten in the form
poe pau || be bs | po 
O pu a, 0 0 bse 
and this is simply the condition for the equivalence of the two regular pencils
$$T^1 = \begin{Vmatrix} 1_h & 0 \\ 0 & 0 \end{Vmatrix} x + \begin{Vmatrix} a_2 & a_3 \\ a_1 & 0 \end{Vmatrix}, \qquad W^1 = \begin{Vmatrix} 1_h & 0 \\ 0 & 0 \end{Vmatrix} x + \begin{Vmatrix} b_2 & b_3 \\ b_1 & 0 \end{Vmatrix}.$$

The pencils $T^1$ and $W^1$ will be called "first kernels" of the pencils $R$ and $U$.
Thus the problem of classifying singular pencils of rank $r$ has been reduced
to that of classifying singular pencils of rank $r - r_1 - c_1$ or else to that of
classifying non-singular pencils if $a_1$ and $a_3$ happen to be zero.

If $r_{j+1}$ and $c_{j+1}$ are the first invariant subranks of $T^j$, and $T^{j+1}$ is a first
kernel of $T^j$ for $j = 1, 2, 3, \dots, m$, and if $T^{m+1}$ is non-singular or zero; then
$r_j$ and $c_j$ for $j = 1, 2, \dots, m + 1$ will be called the "invariant subranks" of
$R$, and $T^{m+1}$ a "kernel" of $R$. This proves the following:


THEOREM 1. Two regular matric pencils are equivalent if and only if they 
have identical sets of invariant subranks and equivalent kernels. 


It is clear that the construction which leads to $T$ can be extended until a
rational canonical form for $R$ is obtained. This canonical form would
display the invariant subranks and invariant factors of $R$. The invariant
factors of $R$ are clearly the same as those of any kernel of $R$ except for
$\sum_i (r_i + c_i)$ units which would appear in the normal form of $R$ but not in the
normal form of any kernel of $R$. This demonstrates the corollary:


COROLLARY 1.1. Two regular matric pencils are equivalent if and only if they
have the same invariant subranks and the same elementary divisors.


4. Transformable matric pencils. Let $A = A_1x_1 + A_2x_2$ be an arbitrary
matric pencil, and define matrices $M_k(A)$ and $N_k(A)$, for $k = 1, 2, 3, \dots$, by the
relations

$$M_1(A) = \begin{Vmatrix} A_1 & A_2 \end{Vmatrix}, \qquad M_2(A) = \begin{Vmatrix} A_1 & A_2 & 0 \\ 0 & A_1 & A_2 \end{Vmatrix}, \qquad \dots,$$

$$N_1(A) = \begin{Vmatrix} A_1 \\ A_2 \end{Vmatrix}, \qquad N_2(A) = \begin{Vmatrix} A_1 & 0 \\ A_2 & A_1 \\ 0 & A_2 \end{Vmatrix}, \qquad \dots.$$

It is convenient to denote by $m_k(A)$ and $n_k(A)$ the ranks of $M_k(A)$ and
$N_k(A)$ and to call $m_k(A)$ the "row singularities" and $n_k(A)$ the "column
singularities" of $A$.
It is obvious that equivalent matric pencils have the same row and column 


singularities. We now proceed to prove the following: 
THEOREM 2. Transformable matric pencils have the same singularities. 
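Theorem 2 can be observed on a concrete pencil. The Python sketch below is an illustration under assumptions: it takes $M_k(A)$ to have $k$ block rows with $A_1$ on the block diagonal and $A_2$ immediately to its right, and $N_k(A)$ to be the analogous block-column arrangement; the example pencil, the substitution matrix `t`, and all helper names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rational matrix by Gauss-Jordan elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def stack_blocks(grid):
    """Assemble one matrix from a rectangular grid of equally sized blocks."""
    return [[v for block in block_row for v in block[i]]
            for block_row in grid for i in range(len(block_row[0]))]


def row_matrix(k, a1, a2):
    """M_k(A): k block rows, A_1 on the diagonal and A_2 just to its right."""
    zero = [[0] * len(a1[0]) for _ in a1]
    return stack_blocks([[a1 if j == i else a2 if j == i + 1 else zero
                          for j in range(k + 1)] for i in range(k)])


def column_matrix(k, a1, a2):
    """N_k(A): k block columns, A_1 on the diagonal and A_2 just below it."""
    zero = [[0] * len(a1[0]) for _ in a1]
    return stack_blocks([[a1 if i == j else a2 if i == j + 1 else zero
                          for j in range(k)] for i in range(k + 1)])


def singularities(a1, a2, k_max):
    """Row and column singularities m_k, n_k for k = 1, ..., k_max."""
    ks = range(1, k_max + 1)
    return ([rank(row_matrix(k, a1, a2)) for k in ks],
            [rank(column_matrix(k, a1, a2)) for k in ks])


def transform(a1, a2, t):
    """Coefficients of A' = (t11 A1 + t21 A2) x1' + (t12 A1 + t22 A2) x2'."""
    def combine(p, q):
        return [[p * u + q * v for u, v in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    return combine(t[0][0], t[1][0]), combine(t[0][1], t[1][1])


# Example pencil with rows (x1, 0), (x2, x1), (0, x2).
A1 = [[1, 0], [0, 1], [0, 0]]
A2 = [[0, 0], [1, 0], [0, 1]]
t = [[1, 2], [3, 5]]  # determinant -1, so the substitution is non-singular

original = singularities(A1, A2, 3)
transformed = singularities(*transform(A1, A2, t), 3)
```

For this pencil the row singularities are 3, 6, 8 and the column singularities are 2, 4, 6, and the transformed pencil yields the same values.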


Proof.* Consider a matric pencil $A = A_1x_1 + A_2x_2$ and a non-singular
transformation of indeterminates $x = tx'$, or more explicitly

$$x_1 = t_{11}x_1' + t_{12}x_2', \qquad x_2 = t_{21}x_1' + t_{22}x_2'.$$
Under this transformation, the pencil $A$ is carried into the pencil
$$A' = At = (t_{11}A_1 + t_{21}A_2)x_1' + (t_{12}A_1 + t_{22}A_2)x_2' = A_1'x_1' + A_2'x_2'$$

and the theorem states that $m_k(A) = m_k(A')$ and $n_k(A) = n_k(A')$. The first of
these equalities will be demonstrated by constructing non-singular matrices
$T^k$ such that

$$M_k(A)T^k = T^{k-1}M_k(A') \qquad \text{for } k = 1, 2, 3, \dots,$$

and the second can be shown by a similar construction.


If $u_0, u_1, u_2, \dots, u_k$ are $k + 1$ indeterminates, then the identity
= ug xi* + + + ug 
defines $u_0', \dots, u_k'$ as linear combinations of $u_0, u_1, \dots, u_k$, and these
may be written in either of the forms

$$u_i' = \sum_{j=0}^{k} T_{ij}^k u_j \qquad \text{or} \qquad u' = T^k u.$$
Now if $T^k = \|T_{ij}^k\|$, then $t \to T^k$ is a representation of the full linear group of
all non-singular matrices of order two, and hence $T^k$ is non-singular since $t$ is.
If (1) is differentiated with respect to x; and 2’, there results: 


( 


* I am indebted to Dr. A. H. Clifford for this proof. 


(2b)


If (2a) is multiplied by $A_1$, and (2b) by $A_2$, and the resulting equations are
added, it follows that


= (ug Ae) Artud Ae) xt As) xe *', 
and from this identity that 


k-1 
ul Ay + = + for i=0,1,---,k—1. 
i=0 


These identities may be written in the form

$$M_k(A)u' = T^{k-1}M_k(A')u$$

or, since $u' = T^k u$, in the form
$$M_k(A)T^k u = T^{k-1}M_k(A')u.$$
The indeterminate vector $u$ may be cancelled in this equation and so

$$M_k(A)T^k = T^{k-1}M_k(A')$$

as was to be shown.
It is convenient, at this point, to state the following: 


LEMMA B.* The invariant factors of transformable matric pencils are
connected by the same transformation of the indeterminates $x_1$ and $x_2$ as the pencils
themselves.


5. Equivalence of general matric pencils. Williamson† has shown that
the minimal numbers of a matric pencil can be expressed in terms of its singu- 
larities, from which follows the theorem: 


THEOREM 3. Two matric pencils are equivalent if and only if they have the 
same singularities and invariant factors. 


This theorem may also be proved with the help of Theorem 2 and Lemma 
B by showing that the invariant subranks of a regular matric pencil can be 
expressed in terms of its singularities. This will now be done for the row sub- 
ranks, and an analogous treatment of the column subranks would complete 
the proof. There is no loss of generality if the pencil is taken to be in canonical 
form. 


* See MacDuffee, loc. cit. 
† Loc. cit.

Consider the regular canonical pencil $W = ex + a$ of rank $r$, with row
subranks $r_k$, and column subranks $c_k$. Set
1,0 0 0 0 
E.=|}0 0 0 0 ||, and | 
000 00 1, 


E, 0 
A, F, 


Then square matrices A; may be defined by the relations 
0 E 0 
Ay =||0 Ax Fe for k=1,2,--:,9, 
0 0 0 
where $A_{q+1}$ is the canonical kernel of $S$ and $E_jA_j^iF_j = 0$ for $j = 1, 2, \dots$.
Of course $r_j$ is the rank of $E_j$, and $c_j$ is the rank of $F_j$.
Now, by definition,
0 £0 0 0 


0 
E,0 0 
= => =f 


If the first column of this matrix is multiplied by $-F_1$ and added to the last
column, and then the third row is multiplied by $-E_1$ and added to the first
row; since $E_1F_1 = 0$, it follows that


0 —E\A; 
r 
1441 
1 A, 


In similar fashion, it is easily shown that
$$m_k(W) = kr + \rho\begin{Vmatrix} E_1 \\ E_1A_1 \\ \vdots \\ E_1A_1^{k-1} \end{Vmatrix} \qquad \text{for } k = 1, 2, 3, \dots.$$


Since $E_1A_1^iF_1 = 0$, it follows that
0 0 

Av =||0 Fall, 


0 0 
and hence that


k-1 


and a simple induction now shows that

$$m_k(W) = kr + \sum_{i=1}^{k} r_i.$$
This equation provides the necessary relationship between the row singularities
and subranks of $W$ and the proof of Theorem 3 is complete. Of course,
$$r_k = m_k - kr - \sum_{j=1}^{k-1} r_j \qquad \text{and} \qquad c_k = n_k - kr - \sum_{j=1}^{k-1} c_j$$
are the inverse equations which express the subranks in terms of the
singularities.
FIXED POINTS UNDER TRANSFORMATIONS OF
CONTINUA WHICH ARE NOT CONNECTED 
IM KLEINEN* 


BY 
O. H. HAMILTON 


1. Introduction. Ayres† has shown that if M is a compact continuous
curve in the plane which does not separate the plane and T is a reversibly 
continuous transformation of M into a subset of itself, then T leaves some 
point of M invariant. He also proves similar theorems for special cases of 
non-planar continuous curves. The purpose of this paper is to extend these 
results to certain types of continua which are not connected im kleinen. 

2. Preliminary theorem and lemmas. We prove first the following pre- 
liminary theorems and lemmas. 


THEOREM I. If $M$ is a compact continuum in a metric space but is not an
indecomposable continuum, and if $T$ is any reversibly continuous transformation
of $M$ into a subset of itself, then some proper subcontinuum $N$ of $M$ contains a
point of $T(N)$, the image of $N$ under $T$.


Proof. If $T$ carries $M$ into a proper subset of itself, then $T(M)$ contains
its image $T^2(M)$ and the theorem is true. Suppose then that $T$ carries $M$ into
itself. It can easily be shown that if $M$ is not indecomposable, it does not
contain two mutually exclusive composants. The transformation $T$ carries
a composant of $M$ into a composant of $M$. It has been shown that a
composant‡ of a continuum is the sum of a countable number of continua
$N_1, N_2, N_3, \dots$, where for each integer $i$, $N_i$ contains $N_{i-1}$.

Let $V$ be any composant of $M$, and let $T(V)$ be its image under $T$. Then $V$
and $T(V)$, by what was said above, must have a point in common. Let $V$ be
expressed as $\sum_{i=1}^{\infty} V_i$, where for each integer $i$, $V_i$ is a continuum which
contains $V_{i-1}$. Then $T(V) = \sum_{i=1}^{\infty} T(V_i)$ where for each integer $i$, $T(V_i)$ is the
image of $V_i$ under $T$ and is a continuum which contains $T(V_{i-1})$. For some
integer $j$, $V_j$ contains a point of $T(V_j)$, for if we suppose the contrary to be
true, it is obvious that $V$ contains no point of $T(V)$, contradicting the fact


* Presented to the Society, September 10, 1937; received by the editors May 29, 1937. 

† W. L. Ayres, Some generalizations of the Scherrer fixed-point theorem, Fundamenta
Mathematicae, vol. 16 (1930), pp. 333-336.

‡ See R. L. Moore, Foundations of Point Set Theory, American Mathematical Society Colloquium
Publications, vol. 13, p. 75.

that $V$ and $T(V)$ have a point in common. $V_j$ is therefore a continuum
satisfying the conclusions of the theorem.


LEMMA 1. If $M$ is a compact continuum in a metric space and $M$ is not an
indecomposable continuum and contains no continuum which is the sum of two
continua whose common part is disconnected, and if $T$ is a reversibly continuous
transformation of $M$ into a subset of itself and neither $T$ nor $T^{-1}$ carries a proper
subcontinuum of $M$ into itself, then there is a proper subcontinuum $L_0$ of $M$ such
that $L_0$ contains a point of $T(L_0)$ but such that for $n > 1$, the common part of $L_0$
and $T^n(L_0)$ is vacuous.


Proof. By Theorem I, some proper subcontinuum $K_0$ of M contains a
point of its image $T(K_0)$. For each positive or negative integer n, let $K_n$
designate $T^n(K_0)$. By a hypothesis of the theorem, $K_0$ is not identical
with $K_1$, for suppose $K_0 = K_1$. Then $K_0$ is a proper subcontinuum of M
which is carried into itself by T. Furthermore $K_0 \cdot K_1$ is not identical
with $K_0$ or with $K_1$; for suppose $K_0 \cdot K_1$ is identical with $K_0$.
Then T carries $K_0$ into a subset of itself. Consider the continuum
$A = \prod_{n=0}^{\infty} K_n$. Then T(A) = A, and T carries A, a proper
subcontinuum of M, into itself. Similarly, if $K_0 \cdot K_1$ is identical with
$K_1$, $T^{-1}$ carries a proper subcontinuum of M into itself; thus we have a
contradiction of a hypothesis of the theorem.

There exists an integer r such that $\prod_{i=0}^{r} K_i$ is not vacuous but
such that $\prod_{i=0}^{r+1} K_i$ is vacuous. This can be shown as follows.
$K_0 \cdot K_1$ is not vacuous. Suppose $\prod_{i=0}^{s} K_i$ contains a point
for each integer s. Then $B = \prod_{i=0}^{\infty} K_i$ is not vacuous and is a
proper subcontinuum of M which is carried into itself by T, which is, again, a
contradiction of a hypothesis of the theorem. Let $L_0$ be the continuum
$\prod_{i=0}^{r-1} K_i$, where r is an integer such that $\prod_{i=0}^{r} K_i$
is not vacuous but $\prod_{i=0}^{r+1} K_i$ is vacuous. For each integer i, let
$L_i = T^i(L_0)$. Then $L_1$ is $\prod_{i=1}^{r} K_i$. $L_0 \cdot L_1$ is not
vacuous since $L_0 \cdot L_1 = \prod_{i=0}^{r} K_i$, which is not vacuous.
$L_0 \cdot L_2$ is vacuous. For suppose $L_0$ contains a point of $L_2$. Then
$\prod_{i=0}^{r-1} K_i$ contains a point P of $\prod_{i=2}^{r+1} K_i$. It
follows that P is in $\prod_{i=0}^{r+1} K_i$, which by hypothesis is vacuous.
Furthermore $L_0$ contains no point of $L_r$, r > 2. For suppose the contrary
and that s is the smallest integer greater than two such that $L_0$ contains
a point of $L_s$. Consider the two continua, $L_0 + L_s$ and
$L_1 + L_2 + \cdots + L_{s-1}$. Their common part is the sum of the two
mutually exclusive continua, $L_0 \cdot L_1$ and $L_{s-1} \cdot L_s$. Then
$\sum_{i=0}^{s} L_i$ is a subcontinuum of M which is the sum of two continua
whose common part is disconnected. This is a contradiction of a hypothesis of
the theorem. The lemma is therefore true.


LEMMA 2. If, in a metric space, M is a compact non-degenerate continuum
but is not an indecomposable continuum and does not contain a continuum
which is the sum of two continua whose common part is disconnected, and if T is
a reversibly continuous transformation of M into a subset of itself, then T carries
some proper subcontinuum of M into itself.


Proof. Suppose T does not carry a proper subcontinuum of M into itself.
Then by Lemma 1, there exists a sequence of subcontinua $L_0, L_1, L_2, \dots$
such that (1) for each positive and negative integer n, $L_n$ is $T^n(L_0)$,
(2) $L_n$ and $L_{n+1}$ have a point in common, but $L_n$ and $L_{n+r}$,
|r| > 1, do not have a point in common. Let $V_0$ be an irreducible continuum
from $L_{-1} \cdot L_0$ to $L_0 \cdot L_1$ which is a subcontinuum of $L_0$.
Let $V_n$, for each positive or negative integer n, designate $T^n(V_0)$. Then
since T is reversibly continuous, it can be easily shown that $V_n$ is
irreducible from $L_{n-1} \cdot L_n$ to $L_n \cdot L_{n+1}$. Let $P_0$ be a
point of $V_0$ not in $L_{-1} \cdot L_0$ or $L_0 \cdot L_1$. There is such a
point for otherwise $L_{-1}$ would contain a point of $L_1$. For each positive
or negative integer n, let $P_n$ designate $T^n(P_0)$. Let $M_0$ be an
irreducible continuum from $P_0$ to $P_1$ which is a subset of
$V_0 + V_1 + L_0 \cdot L_1$. Then since T is reversibly continuous, it follows
that (1) $M_n$ for each integer n is irreducible from $P_n$ to $P_{n+1}$,
(2) $M_n \cdot M_{n+1}$ is not vacuous, but $M_n \cdot M_{n+r}$, |r| > 1, is
vacuous, (3) $(M_i + M_{i-1}) \cdot V_i$ is an irreducible continuum from
$L_{i-1} \cdot L_i$ to $L_i \cdot L_{i+1}$.

Since M is not indecomposable, M is the sum of two proper subcontinua
H and K. One of these continua contains points of infinitely many of the
continua $M_0, M_1, M_2, \dots$. Suppose H contains points of infinitely many
of these continua. Let j be the smallest integer greater than or equal to zero
such that H contains a point of $M_j$. Let r be any integer greater than 5 such
that H contains a point of $M_{j+r}$. Let H' designate the common part of H and
the continuum $\sum_{i=j}^{j+r} M_i$. But H' is itself a continuum since the
common part of two subcontinua of M cannot be disconnected. H' must contain
points of each of the continua $M_{j+1}, \dots, M_{j+r-1}$, since each of
these continua separates $M_j$ from $M_{j+r}$ in $\sum_{i=j}^{j+r} M_i$. It
follows that H' contains $P_{j+2}, P_{j+3}, \dots, P_{j+r-2}$, since for each
integer s, $P_{j+s}$ lies in a subcontinuum of $M_{j+s-1} + M_{j+s}$ which is
irreducible from $L_{j+s-1} \cdot L_{j+s}$ to $L_{j+s} \cdot L_{j+s+1}$. Then
H' contains each of the continua $M_{j+2}, \dots, M_{j+r-3}$, since each of
these continua is irreducible between two points of
$\sum_{i=j+2}^{j+r-2} P_i$. Since r can be taken arbitrarily large, it follows
that H' contains $M_{j+n}$ for each integer n greater than 1.

Consider the continuum $C = \sum_{i=j+2}^{\infty} M_i$. It is obvious that
T(C) is a subset of C. It follows that C is identical with M, for otherwise C
contains a proper subcontinuum of M which is carried into itself by T, contrary
to hypothesis. But by the argument given above C is a subset of H; therefore H
is identical with M. This contradicts the assumption that H is a proper
subcontinuum of M. The lemma is therefore true.

3. A general theorem on fixed points. We may now prove the general
theorem:


THEOREM II. If M is a compact continuum in a metric space, and if M does 
not contain an indecomposable continuum and does not contain a continuum 
which is the sum of two continua whose common part is disconnected, then every 
reversibly continuous transformation of M into a subset of itself leaves some point 
of M invariant. 


Proof. By Lemma 2, T carries some proper subcontinuum $M_1$ of M into
itself. Let $M_1, M_2, \dots, M_\omega, M_{\omega+1}, \dots$ be a well ordered
sequence β of subcontinua of M having the following properties: (1) for each
ordinal number θ which has an immediate predecessor, $M_\theta$ is a proper
subcontinuum of $M_{\theta-1}$ which is carried into itself by T; (2) for each
ordinal number φ which has no immediate predecessor, $M_\varphi$ is a proper
subcontinuum of $M_\alpha$ which is carried into itself by T, where α is any
ordinal which is less than φ; (3) if $M_\lambda$ is any non-degenerate
continuum which belongs to the sequence β, $M_{\lambda+1}$ is a proper
subcontinuum of $M_\lambda$ belonging to the sequence β. Since M is metric, it
is completely separable, and it follows that the sequence β thus defined is
countable. Therefore there exists a countable simple subsequence
$M_{n_1}, M_{n_2}, \dots$ running through β. That is, if $M_\lambda$ is any
element of β, there is an integer i such that $M_{n_i}$ is a subcontinuum of
$M_\lambda$. Since M is compact, the point set $P = \prod_{i=1}^{\infty} M_{n_i}$
is a compact continuum and is carried into itself by T, since each of the
continua $M_{n_i}$ is carried into itself by T. But P cannot be a
non-degenerate continuum, for if it is, it has a proper subcontinuum which
belongs to the sequence β; and this contradicts the definitions of P and of the
sequences β and $M_{n_i}$. It follows that P is a point of M, invariant under T.
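The statement that P is carried into itself can be written out as a short set computation. The following is only a sketch: it assumes that "carried into itself" means $T(M_{n_i}) \subset M_{n_i}$ for every i, and it writes the common part (denoted by a product in the text) as an intersection.

```latex
\[
T(P) \;=\; T\Bigl(\,\bigcap_{i=1}^{\infty} M_{n_i}\Bigr)
\;\subset\; \bigcap_{i=1}^{\infty} T(M_{n_i})
\;\subset\; \bigcap_{i=1}^{\infty} M_{n_i} \;=\; P .
\]
```

When P reduces to a single point p, this inclusion reads T(p) = p, which is the invariance asserted in the last sentence of the proof.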

4. Applications to continua in the plane. We have now the following ap- 
plications: 


THEOREM III. If M is a compact continuum in the plane which contains no 
indecomposable continuum, which does not separate the plane, and which con- 
tains no domain, then every reversibly continuous transformation of M into a 
subset of itself leaves some point invariant. 


Proof. It follows from a theorem proved by S. Janiszewski* and also by
Miss Mullikan† that a sufficient condition that a compact continuum M
separate the plane is that M be the sum of two continua whose common
part is disconnected. Then if M does not separate the plane and contains no
domain, no subcontinuum of M separates the plane, and the hypotheses of
Theorem II are satisfied. The conclusions of Theorem III then follow.


* S. Janiszewski, Sur les coupures du plan faites par les continus, Prace Matematyczno-fizyczne,
vol. 26 (1913), pp. 11-63.

† Anna M. Mullikan, Certain theorems relating to plane connected point sets, these Transactions,
vol. 24 (1922), pp. 144-162.


THEOREM IV. If D is a bounded simply connected domain in the plane which, 
together with its boundary, does not separate the plane and whose outer boundary 
M contains no indecomposable continuum, then every reversibly continuous 
transformation of D into itself leaves some point of D invariant. 


Proof. Without loss of generality we may suppose the boundary of D to
be identical with its outer boundary, for the compact complementary domain
of the outer boundary M of D is itself a simply connected domain which is
a subset of D and whose boundary is M. Carathéodory* has shown that there
exists a conformal and reversibly continuous transformation $T_1$ of D into the
interior I of a given circle J, that there exists a reversibly one-to-one
correspondence, also designated by $T_1$, between the prime ends of D and the
points of J, and that this correspondence is characterized as follows: If
$P_1, P_2, P_3, \dots$ is a sequence of points of D converging to a prime end
$E_P$ of D (using Carathéodory's definition of convergence), then the sequence
of points $T_1(P_1), T_1(P_2), T_1(P_3), \dots$ converges (in the usual sense)
to the point P of J with which $E_P$ is associated by this correspondence; and
conversely, if $Q_1, Q_2, Q_3, \dots$ is a sequence of points of I converging
to a point Q of J, then the sequence of points
$T_1^{-1}(Q_1), T_1^{-1}(Q_2), T_1^{-1}(Q_3), \dots$ converges to the prime end
$E_Q$ with which Q is associated by the correspondence $T_1$. Let T be any
reversibly continuous transformation of D into itself. It can be easily
shown that T carries a prime end of D into a prime end of D. Let $T_2$ be a
transformation of the continuum I + J into itself defined as follows: If P is
any point of I, let $T_2(P)$ be the point $T_1[T(T_1^{-1}(P))]$. If P is any
point of J, let $P_1, P_2, \dots$ be a sequence of points of I converging to P.
Let $E_P$ be the prime end of D associated with P by $T_1$. Then the sequence
$T_1^{-1}(P_1), T_1^{-1}(P_2), T_1^{-1}(P_3), \dots$ converges to $E_P$. Since
T is reversibly continuous, the sequence
$T(T_1^{-1}(P_1)), T(T_1^{-1}(P_2)), \dots$ converges to the prime end
$T(E_P)$ into which $E_P$ is transformed by T. It follows that the sequence of
points of I, $T_1[T(T_1^{-1}(P_1))], T_1[T(T_1^{-1}(P_2))], \dots$ converges to
a point of J. That point is defined as $T_2(P)$. Since $T_1[T(T_1^{-1}(P_i))]$
is defined as $T_2(P_i)$, and since T is reversibly continuous on D, and $T_1$
is a reversibly continuous transformation of D into I, it follows that $T_2$ is
reversibly continuous on I. Also from the definition of $T_2$ it follows that
if $P_1, P_2, P_3, \dots$ is a sequence of points of I converging to a point P
of J, then the sequence $T_2(P_1), T_2(P_2), T_2(P_3), \dots$ converges to
$T_2(P)$, and the sequence $T_2^{-1}(P_1), T_2^{-1}(P_2), T_2^{-1}(P_3), \dots$
converges to $T_2^{-1}(P)$. Suppose that $Q_1, Q_2, Q_3, \dots$ is a sequence
of points of J converging to a point Q of J. For each positive integer n, let
$Z_n$ be a point of I such that each of the distances $d(Z_n, Q_n)$ and
$d[T_2(Z_n), T_2(Q_n)]$ is less than 1/n. That such a point $Z_n$ exists is
shown as follows: There exists a sequence of points
$Z_{n1}, Z_{n2}, Z_{n3}, \dots$ of I converging to $Q_n$. Then by the
discussion above, the sequence $T_2(Z_{n1}), T_2(Z_{n2}), T_2(Z_{n3}), \dots$
converges to $T_2(Q_n)$. There exists an integer i such that
$d(Z_{ni}, Q_n) < 1/n$ and $d[T_2(Z_{ni}), T_2(Q_n)] < 1/n$. Then $Z_{ni}$ has
the property required of $Z_n$. We see then that the convergence of the
sequence $Q_1, Q_2, \dots$ to Q implies the convergence of the sequence
$Z_1, Z_2, \dots$ to Q. This in turn implies the convergence of
$T_2(Z_1), T_2(Z_2), \dots$ to $T_2(Q)$ and therefore the convergence of
$T_2(Q_1), T_2(Q_2), \dots$ to $T_2(Q)$. Similarly the sequence of points
$T_2^{-1}(Q_1), T_2^{-1}(Q_2), \dots$ converges to $T_2^{-1}(Q)$, and $T_2$ is
a reversibly continuous transformation of the continuum I + J into itself. By
a fundamental theorem on fixed points, $T_2$ leaves some point of I + J fixed.
If $T_2$ leaves a point of I fixed, then obviously T leaves a point of D fixed.
If $T_2$ leaves a point P of J fixed, then T carries some prime end $E_P$ of D
into itself. Let $N_P$ be the continuum in the boundary of D associated with
the prime end $E_P$ in the sense that every sequence of points of D converging
to $E_P$ has a subsequence which converges in the usual sense to a point of
$N_P$. Then T carries $N_P$ into itself. N. E. Rutt* has shown that a necessary
condition that such a continuum as $N_P$ shall be the whole boundary of D is
that the boundary of D shall be indecomposable or the sum of two
indecomposable subcontinua. Since by hypothesis the boundary of D contains no
indecomposable continuum, $N_P$ is a proper subcontinuum of the boundary of D.
$N_P$ then is a compact continuum which does not separate the plane and which
contains no domain. It satisfies the hypothesis of Theorem III; therefore T
leaves some point of $N_P$ fixed. Thus, in any case, T leaves a point of D
invariant.


* C. Carathéodory, Über die Begrenzung einfach zusammenhängender Gebiete, Mathematische
Annalen, vol. 73 (1912), pp. 323-370.

5. An example. There exist continua which admit of no continuous trans- 
formation into themselves except the identity. We give below an example of 
a compact acyclic continuous curve in the plane having this property. 

Let $A_1$ be any arc in the plane of length 1. Let $A_2$ be the sum of $A_1$
and two arcs of length 1/2 having no point in common with each other or with
$A_1$ except that each has one end point at the midpoint of $A_1$. For each
integer n, let $E_{n-1}$ designate the set of points of $A_{n-1}$ of Menger
order greater than two. Let $A_n$ be the sum of $A_{n-1}$ and n arcs of length
$1/2^n$ having no point in common with each other or with $A_{n-1}$ except that
they all have in common one end point which is the midpoint of a component of
$A_{n-1} - E_{n-1}$ whose length is not less than the length of any other
component of $A_{n-1} - E_{n-1}$. Let $M = \sum_{n=1}^{\infty} A_n$. Then M is
obviously an acyclic continuous curve. Let K be the set of points of M of
Menger order greater than two. From the definition of M it follows that no two
points of K are of the same Menger order and that K is everywhere dense in M.
Let T be any continuous transformation of M into itself. T carries a point of
given Menger order into a point of the same Menger order and therefore leaves
each point of K fixed. Since K is everywhere dense in M, T leaves each point
of M fixed.


* N. E. Rutt, Prime ends and indecomposability, Bulletin of the American Mathematical Society,
vol. 41 (1935), pp. 265-273.
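The last step of the example, from "T fixes every point of K" to "T fixes every point of M", uses only continuity and density. A brief sketch, where the sequence $p_k$ is an auxiliary name introduced here and not in the original:

```latex
\[
x \in M,\qquad p_k \in K,\qquad p_k \to x
\quad\Longrightarrow\quad
T(x) \;=\; \lim_{k\to\infty} T(p_k) \;=\; \lim_{k\to\infty} p_k \;=\; x .
\]
```

Such a sequence $p_k$ exists for every x because K is everywhere dense in M, and the first equality is the continuity of T at x.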

It should be noted that M is homeomorphic with a proper subset of itself.
E. W. Miller* has given an example of an acyclic continuous curve which is
not homeomorphic with any proper subset of itself, but which however does
not have the property of the example we have given, since his acyclic
continuous curve contains an arc which contains no points of Menger order
greater than two. I have not been able to find an example of an acyclic
continuous curve in the plane whose only transformation into a subset of itself
is the identity.


* E. W. Miller, The Zarankiewicz problem, Bulletin of the American Mathematical Society, vol. 38
(1932), pp. 831-834.


THE TWO CONFORMAL INVARIANTS OF FIFTH ORDER*


BY 
EDWARD KASNER 


In the problem of classifying curvilinear angles in the plane with respect
to the group of conformal transformations, the simplest invariant (other than
angular magnitude θ) which occurs is found in the case of a horn angle of
first order contact (formula (1)). This is a differential invariant of third order,
since it involves third derivatives of the two curves composing the angle.
There are no invariants of fourth order (nor of any even order), but there
are known to be just two of fifth order, arising in the cases of the horn angle
of second order contact and the general right angle, respectively.† The
main object of this paper is to determine the two invariants of fifth order
explicitly. The results appear in formulas (15) and (23) below.‡ It is shown
elsewhere§ how the latter of these expressions can be derived from the
former through conformal symmetry. As a corollary we obtain in §3 the
invariant of lowest order of a single curve under the inversion group, first found
by G. W. Mullins. Generalizations to the case of angles drawn on a curved
surface have been given by G. Comenetz.||

I wish to acknowledge the valuable assistance of Miss A. Vassell and J. De 
Cicco in calculating the invariants, and of G. Comenetz in connection with §3 
and the use of oriented curves. 

1. Horn angle of second order. We use the term curvilinear angle for the
figure formed by an ordered pair of oriented, analytic curve-elements having
a common point O, the vertex of the angle.

The symbol γ is employed for the curvature of a curve, γ' for the
derivative dγ/ds with respect to arc-length s, γ'' for the second derivative
$d^2\gamma/ds^2$, and so on. Given a curvilinear angle, the symbols
$\gamma_1, \gamma_1', \gamma_1'', \dots$ denote the curvature and its
successive arc-length derivatives for the first side of the angle, evaluated
at the vertex O; and $\gamma_2, \gamma_2', \gamma_2'', \dots$ are the
corresponding quantities for the second side of the angle. The conformal
invariants of curvilinear angles given below are all stated in terms of the
metric invariants $\gamma_1, \gamma_1', \dots$; $\gamma_2, \gamma_2', \dots$.

* Presented to the Society, September 10, 1937; received by the editors June 11, 1937. Abstract
in Science, vol. 82 (1935), p. 622.

† E. Kasner, Conformal geometry, Proceedings of the Fifth International Congress of
Mathematicians, Cambridge, 1912, vol. 2, p. 81.

‡ The next simplest invariants, of seventh order, would be encountered in dealing with the horn
angle of third order contact and the general 60° angle. The latter is being studied by S. H. Gould and
. P. Doria.

§ See E. Kasner, Conformal symmetry (Schwarzian reflexion), Annals of Mathematics, vol. 38
(1937), pp. 873-879.

|| Bulletin of the American Mathematical Society, abstract 42-11-399.


A curvilinear angle is called a horn angle if the two curves coincide in
direction at the vertex, and is described as being of kth order contact
(k = 1, 2, ...) if the order of contact of the curves at the point of
tangency is exactly k. A horn angle of kth order contact is called an
$H_{k+1}$. The conformal invariant of an $H_2$, as the author has shown, is
given by*

(1) $\dfrac{\gamma_2' - \gamma_1'}{(\gamma_2 - \gamma_1)^2}$.

To derive the analogous invariant for an $H_3$ we proceed as follows.

Suppose that an $H_3$ lying in the (x, y)-plane is transformed conformally
into an $H_3$ lying in the (X, Y)-plane. We may assume that the coordinates are
so chosen in each plane that the vertex of the angle is the origin and the initial
tangent direction is that of the positive axis of abscissas. Then in the
(x, y)-plane the first side of the angle is given by

(2) $y = a_2x^2 + a_3x^3 + \cdots$

and the second side by

(2') $y = a_2'x^2 + a_3'x^3 + \cdots$.

Here $a_2 = a_2'$ and $a_3 \neq a_3'$, because the contact is of second order
(exactly). In the (X, Y)-plane the first and second sides of the transformed
angle are given respectively by

(3) $Y = A_2X^2 + A_3X^3 + \cdots$

and

(3') $Y = A_2'X^2 + A_3'X^3 + \cdots$,

with $A_2 = A_2'$ and $A_3 \neq A_3'$. (All the curves are oriented to the right.)

If we set z = x + iy and w = X + iY, the conformal transformation can be
represented in some neighborhood of the origin by

(4) $w = c_1z + c_2z^2 + c_3z^3 + \cdots$,

where the coefficients $c_n$ are complex ($c_n = \alpha_n + i\beta_n$), and
$c_1$ is not zero. As the initial direction of the positive axis of abscissas
remains fixed, $c_1$ is necessarily real and positive; $c_1 = \alpha_1 > 0$.
To simplify the calculations we shall assume at first that $c_1 = 1$, so that
instead of (4) we have

(4') $w = z + c_2z^2 + c_3z^3 + \cdots$.

It will be easy to allow for the more general transformation (4) later.

Separating (4') into real and imaginary parts, we obtain the equations of
the transformation in the real form

(5) $X = x + \alpha_2(x^2 - y^2) - 2\beta_2xy + \alpha_3(x^3 - 3xy^2) + \cdots$,
    $Y = y + 2\alpha_2xy + \beta_2(x^2 - y^2) + \alpha_3(3x^2y - y^3) + \cdots$.


* See Cambridge paper, and papers by Kasner and Comenetz, Proceedings of the National
Academy of Sciences and the American Mathematical Monthly, 1936-1938.


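Substituting (5) into (3) and eliminating y by means of (2), we derive an
identity in x. If we equate the coefficients of $x^2, \dots, x^5$, respectively,
and solve for $A_2, \dots, A_5$, after some computation we find the following
relations:

$(6_1)$ $A_2 = a_2 + \beta_2$,

$(6_2)$ $A_3 = a_3 + \beta_3 - 2\alpha_2\beta_2$,

$(6_3)$ $A_4 = a_4 - \alpha_2a_3 + 3\beta_2a_2^2 + (\alpha_3 - \alpha_2^2 + 4\beta_2^2)a_2 + \beta_4 - 2\beta_2\alpha_3 - 3\alpha_2\beta_3 + 5\alpha_2^2\beta_2$,

$(6_4)$ $A_5 = a_5 - 2\alpha_2a_4 + 8\beta_2a_2a_3 + (\alpha_2^2 + 4\beta_2^2)a_3 + 2\alpha_2a_2^3 + (3\beta_3 - 6\alpha_2\beta_2)a_2^2 + (2\alpha_4 - 6\alpha_2\alpha_3 + 4\alpha_2^3 - 24\alpha_2\beta_2^2 + 12\beta_2\beta_3)a_2 + \beta_5 - 4\alpha_2\beta_4 - 2\beta_2\alpha_4 - 3\alpha_3\beta_3 + 12\alpha_2\beta_2\alpha_3 + 9\alpha_2^2\beta_3 - 14\alpha_2^3\beta_2$.

Of course the corresponding relations $(6_1')$–$(6_4')$ for the second side of
the angle are obtained from these by replacing $a_2, \dots, a_5$,
$A_2, \dots, A_5$ by $a_2', \dots, a_5'$, $A_2', \dots, A_5'$, respectively.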
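The first two relations can be checked by hand, carrying the substitution through third order only. The sketch below assumes that (4') reads $w = z + c_2z^2 + c_3z^3 + \cdots$ with $c_n = \alpha_n + i\beta_n$, so that the real form of Y also contains the term $\beta_3(x^3 - 3xy^2)$; the intermediate expansions are not taken from the original.

```latex
\begin{align*}
X &= x + \alpha_2 x^2 + O(x^3), \\
Y &= (a_2 + \beta_2)\,x^2 + (a_3 + 2\alpha_2 a_2 + \beta_3)\,x^3 + O(x^4), \\
A_2 X^2 + A_3 X^3 &= A_2\,x^2 + (2\alpha_2 A_2 + A_3)\,x^3 + O(x^4).
\end{align*}
% Equating the coefficients of x^2 and of x^3 on the two sides:
\begin{align*}
A_2 &= a_2 + \beta_2, \\
A_3 &= a_3 + \beta_3 + 2\alpha_2 a_2 - 2\alpha_2 (a_2 + \beta_2)
     \;=\; a_3 + \beta_3 - 2\alpha_2 \beta_2 .
\end{align*}
```

The same procedure, carried to the coefficients of $x^4$ and $x^5$, yields the two longer relations; only the bookkeeping grows.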
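In order to find invariants we must eliminate the coefficients $\alpha_n$,
$\beta_n$ of the transformation. To eliminate $\beta_5$ we subtract $(6_4)$
from $(6_4')$, recalling that $a_2' = a_2$. The result is

(7) $\Delta A_5 = \Delta a_5 - 2\alpha_2\Delta a_4 + 8\beta_2a_2\Delta a_3 + (\alpha_2^2 + 4\beta_2^2)\Delta a_3$,

where $\Delta A_j = A_j' - A_j$, $\Delta a_j = a_j' - a_j$ (j = 3, 4, 5).
Similarly we eliminate $\beta_4$ and $\beta_3$ by forming the differences

(8) $\Delta A_4 = \Delta a_4 - \alpha_2\Delta a_3$,

(9) $\Delta A_3 = \Delta a_3$.

Finally $\alpha_2$ and $\beta_2$ are eliminated by substituting from $(6_1)$
and (8) into (7). Rearranging and using (9), we find that

(10) $\Delta A_3\Delta A_5 - (\Delta A_4)^2 - 4A_2^2(\Delta A_3)^2 = \Delta a_3\Delta a_5 - (\Delta a_4)^2 - 4a_2^2(\Delta a_3)^2$.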
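The elimination that produces (10) amounts to completing two squares. The sketch below assumes that (7), (8), (9) and the first of the relations (6) read $\Delta A_5 = \Delta a_5 - 2\alpha_2\Delta a_4 + 8\beta_2 a_2\Delta a_3 + (\alpha_2^2 + 4\beta_2^2)\Delta a_3$, $\Delta A_4 = \Delta a_4 - \alpha_2\Delta a_3$, $\Delta A_3 = \Delta a_3$ and $A_2 = a_2 + \beta_2$; the grouping of terms is illustrative and not taken from the original.

```latex
\begin{align*}
\Delta A_3\,\Delta A_5
 &= \Delta a_3\,\Delta a_5
    + \bigl(\alpha_2^2\,\Delta a_3^2 - 2\alpha_2\,\Delta a_3\,\Delta a_4\bigr)
    + \bigl(4\beta_2^2 + 8\beta_2 a_2\bigr)\Delta a_3^2 \\
 &= \Delta a_3\,\Delta a_5
    + \bigl[(\Delta a_4 - \alpha_2\,\Delta a_3)^2 - (\Delta a_4)^2\bigr]
    + \bigl[4(a_2 + \beta_2)^2 - 4a_2^2\bigr]\Delta a_3^2 \\
 &= \Delta a_3\,\Delta a_5 + (\Delta A_4)^2 - (\Delta a_4)^2
    + 4A_2^2(\Delta A_3)^2 - 4a_2^2(\Delta a_3)^2 .
\end{align*}
```

Collecting the capital-letter terms on the left gives an identity whose two sides have the same form, which is what makes the expression an invariant of the special transformation.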


Equations (9) and (10) exhibit two quantities which are invariant under
the special transformation (4'). If we follow (4') by a magnification

(11) $w = \alpha_1z$, $\alpha_1 > 0$,

the resultant is a general conformal transformation of the type (4). It is easy
to see that the effect of (11) on a curve (2) is to divide the coefficient $a_n$
by $\alpha_1^{n-1}$. Consequently (9) and (10) still hold under the general
transformation (4) provided we insert the factors $1/\alpha_1^2$ and
$1/\alpha_1^6$ in the right members of (9) and (10), respectively.
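The two scaling factors can be read off by assigning the weight n - 1 to $a_n$. A short sketch of the bookkeeping, assuming the magnification is $X = \alpha_1 x$, $Y = \alpha_1 y$ and that the invariant expression consists of the three terms shown:

```latex
\begin{align*}
X = \alpha_1 x,\quad Y = \alpha_1 y,\quad y = \sum_{n\ge 2} a_n x^n
\;&\Longrightarrow\;
Y = \sum_{n\ge 2} \frac{a_n}{\alpha_1^{\,n-1}}\,X^n , \\
\Delta a_3 \;&\longmapsto\; \alpha_1^{-2}\,\Delta a_3 , \\
\Delta a_3\,\Delta a_5,\quad (\Delta a_4)^2,\quad a_2^2(\Delta a_3)^2
\;&\longmapsto\; \alpha_1^{-6}\times(\text{the same term}) .
\end{align*}
```

Each of the three terms has total weight 6 (namely 2 + 4, 3 + 3 and 1 + 1 + 2 + 2), which is why a single factor suffices for the whole expression.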

The coefficients of the curve (2) are expressed in terms of the curvature
and its arc-length derivatives at the origin by the standard formulas

$(12_1)$ $2a_2 = \gamma_1$,

$(12_2)$ $6a_3 = \gamma_1'$,

$(12_3)$ $24a_4 = \gamma_1'' + 3\gamma_1^3$,

$(12_4)$ $120a_5 = \gamma_1''' + 19\gamma_1^2\gamma_1'$.

Using these formulas and the similar ones for the curve (2'), we find that the
right members of (9) and (10) can be written, except for numerical factors, as

(13) $\gamma_2' - \gamma_1'$

and

(14) $4(\gamma_2' - \gamma_1')(\gamma_2''' - \gamma_1''') - 5(\gamma_2'' - \gamma_1'')^2 - 4\gamma^2(\gamma_2' - \gamma_1')^2$,

respectively, where $\gamma = \gamma_1 = \gamma_2$. Now the Jacobian at O of
the transformation (4) is $\alpha_1^2$. Consequently we can state this theorem:

THEOREM I. The quantities (13) and (14) are relative invariants of a horn 
angle of second order contact under conformal transformation. They are of 
index 1 and 3 respectively. 
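The numerical factor relating the right member of (10) to (14) can be made explicit. This sketch assumes the curvature formulas $2a_2 = \gamma$, $6a_3 = \gamma'$, $24a_4 = \gamma'' + 3\gamma^3$, $120a_5 = \gamma''' + 19\gamma^2\gamma'$ for each side, writes Δ for "second side minus first side", and uses $\gamma_1 = \gamma_2 = \gamma$; the factor 2880 is computed here and does not appear in the original.

```latex
\begin{align*}
\Delta a_3 = \tfrac{1}{6}\,\Delta\gamma', \qquad
\Delta a_4 = \tfrac{1}{24}\,\Delta\gamma'', \qquad
\Delta a_5 = \tfrac{1}{120}\bigl(\Delta\gamma''' + 19\gamma^2\,\Delta\gamma'\bigr), \qquad
a_2 = \tfrac12\gamma ,
\end{align*}
\begin{align*}
\Delta a_3\,\Delta a_5 - (\Delta a_4)^2 - 4a_2^2(\Delta a_3)^2
 &= \frac{\Delta\gamma'\,(\Delta\gamma''' + 19\gamma^2\Delta\gamma')}{720}
    - \frac{(\Delta\gamma'')^2}{576}
    - \frac{\gamma^2(\Delta\gamma')^2}{36} \\
 &= \frac{4\,\Delta\gamma'\,\Delta\gamma''' - 5(\Delta\gamma'')^2
          - 4\gamma^2(\Delta\gamma')^2}{2880} .
\end{align*}
```

Since this quantity has index 3 and $\Delta\gamma'$ has index 1, dividing it by $(\Delta\gamma')^3$ removes the dependence on the Jacobian.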

In particular the sign of (13) is invariant, since the Jacobian is positive. 
The fundamental theorem is as follows: 

THEOREM II. The quantity

(15) $\dfrac{4(\gamma_2' - \gamma_1')(\gamma_2''' - \gamma_1''') - 5(\gamma_2'' - \gamma_1'')^2 - 4\gamma^2(\gamma_2' - \gamma_1')^2}{(\gamma_2' - \gamma_1')^3}$

and the sign of $\gamma_2' - \gamma_1'$ are invariants of a horn angle of
second order contact under conformal transformation. Conversely, if two such
angles agree in the value of (15) and the sign of (13) they can be made to
coincide by a conformal transformation (at least to any finite order of
contact).*

The first statement follows from Theorem I. For the proof of the second
(converse) statement we refer to the Cambridge paper, where the method of
conformal symmetry is applied.


* A similar theorem holds for an $H_2$, the value of (1) and the sign of $\gamma_2 - \gamma_1$
being the corresponding invariants.


2. General curvilinear right angle. A curvilinear right angle is said to be
a “general” right angle provided $\gamma_1' + \gamma_2' \neq 0$.*

Let the first side of the angle be given by (2), and the second side by

(16) $x = b_2y^2 + b_3y^3 + \cdots$

(oriented upwards). Let the transformed first side be given by (3) and the
second side by

(17) $X = B_2Y^2 + B_3Y^3 + \cdots$.

Again we begin with a transformation of the special type (4').

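In deducing the equations analogous to $(6_1)$–$(6_4)$ for the second side of
the angle, we can avoid a new calculation if we observe that (4') can be
written as

$Y + iX = (y + ix) - i(\alpha_2 - i\beta_2)(y + ix)^2 - (\alpha_3 - i\beta_3)(y + ix)^3 + \cdots$.

This shows that if we interchange X and Y, x and y, A and B, and a and b,
and furthermore replace $\alpha_2, \beta_2, \alpha_3, \beta_3, \alpha_4, \beta_4, \alpha_5, \beta_5$
by $-\beta_2, -\alpha_2, -\alpha_3, \beta_3, \beta_4, \alpha_4, \alpha_5, -\beta_5$,
respectively, then the calculation that was made for $(6_1)$–$(6_4)$ will turn
into the calculation which we now require. Consequently to obtain the new
relations we need only make the above interchanges and replacements in
equations $(6_1)$–$(6_4)$. The result is

$(18_1)$ $B_2 = b_2 - \alpha_2$,

$(18_2)$ $B_3 = b_3 + \beta_3 - 2\alpha_2\beta_2$,

$(18_3)$ $B_4 = b_4 + \beta_2b_3 - 3\alpha_2b_2^2 + (-\alpha_3 + 4\alpha_2^2 - \beta_2^2)b_2 + \alpha_4 - 2\alpha_2\alpha_3 + 3\beta_2\beta_3 - 5\alpha_2\beta_2^2$,

$(18_4)$ $B_5 = b_5 + 2\beta_2b_4 - 8\alpha_2b_2b_3 + (4\alpha_2^2 + \beta_2^2)b_3 - 2\beta_2b_2^3 + (3\beta_3 - 6\alpha_2\beta_2)b_2^2 + (2\beta_4 - 6\beta_2\alpha_3 - 12\alpha_2\beta_3 + 24\alpha_2^2\beta_2 - 4\beta_2^3)b_2 - \beta_5 + 4\beta_2\alpha_4 + 2\alpha_2\beta_4 + 3\alpha_3\beta_3 - 12\alpha_2\beta_2\alpha_3 + 9\beta_2^2\beta_3 - 14\alpha_2\beta_2^3$.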

To eliminate $\beta_5$ we add $(6_4)$ and $(18_4)$. Then we subtract
$2A_2B_4 + 2B_2A_4$ from the sum to eliminate $\alpha_4$ and $\beta_4$. Next
the subtraction of $3(A_2^2 + B_2^2)(A_3 + B_3)/2$ removes $\beta_3$. Finally
we add $-5(A_2^2 - B_2^2)(A_3 - B_3)/2 + 2B_2A_2^3 + 2A_2B_2^3$. In this
manner we obtain two expressions which are invariant under the special
transformation (4'), namely,

(19) $a_3 - b_3$,

(20) $a_5 + b_5 - 2a_2b_4 - 2b_2a_4 - 4b_2^2b_3 + a_2^2b_3 + b_2^2a_3 - 4a_2^2a_3 + 2a_2b_2^3 + 2a_2^3b_2$.

Under the general transformation (4), (19) would be multiplied by
$1/\alpha_1^2$ and (20) by $1/\alpha_1^4$.


* An equivalent definition is that the curve obtained on reflecting the first side of the angle in
the second side (by conformal symmetry or Schwarzian reflection) and then reversing orientation,
forms an $H_3$ with the first side.


In introducing curvatures for the vertical curve (16) we must be careful to
note that the proper formulas are the negatives of $(12_1)$–$(12_4)$; thus
$2b_2 = -\gamma_2$, $6b_3 = -\gamma_2'$, and so on. Substituting in (19) and
(20) we find


(21) $\gamma_1' + \gamma_2'$,
(22) — — + vari’) — v2 + — + 


THEOREM III. The quantities (21) and (22) are relative conformal invariants 
of any curvilinear right angle. They are of index 1 and 2, respectively. 
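The fifth-order relative invariant can be written out in terms of curvatures by direct substitution. The sketch below assumes that (20) is the expression $a_5 + b_5 - 2a_2b_4 - 2b_2a_4 - 4b_2^2b_3 + a_2^2b_3 + b_2^2a_3 - 4a_2^2a_3 + 2a_2b_2^3 + 2a_2^3b_2$, and uses $2a_2 = \gamma_1$, $6a_3 = \gamma_1'$, $24a_4 = \gamma_1'' + 3\gamma_1^3$, $120a_5 = \gamma_1''' + 19\gamma_1^2\gamma_1'$ together with the negatives of these for the second side. The numerical factor and the ordering of terms are computed here and may differ from the form printed as (22).

```latex
\begin{align*}
&120\,\bigl[\,a_5 + b_5 - 2a_2 b_4 - 2b_2 a_4 - 4b_2^2 b_3 + a_2^2 b_3
        + b_2^2 a_3 - 4a_2^2 a_3 + 2a_2 b_2^3 + 2a_2^3 b_2\,\bigr] \\
&\qquad = (\gamma_1''' - \gamma_2''')
   + 5(\gamma_1\gamma_2'' + \gamma_2\gamma_1'')
   - (\gamma_1^2\gamma_1' - \gamma_2^2\gamma_2')
   - 5(\gamma_1^2\gamma_2' - \gamma_2^2\gamma_1') .
\end{align*}
```

In this substitution the terms in $\gamma_1\gamma_2^3$ and $\gamma_1^3\gamma_2$ cancel identically, so no purely algebraic term in the curvatures survives. Because this quantity has index 2 and $\gamma_1' + \gamma_2'$ has index 1, its quotient by $(\gamma_1' + \gamma_2')^2$ is independent of the Jacobian.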


THEOREM IV. The quantity

(23) $\dfrac{[\text{expression (22)}]}{(\gamma_1' + \gamma_2')^2}$

and the sign of $\gamma_1' + \gamma_2'$ are conformal invariants of a general
right angle. Conversely, if two such angles agree in the value of (23) and the
sign of (21), they can be made to coincide by a conformal transformation (at
least to any finite order of contact).*

The proof of the converse portion of the theorem is given in the Cambridge
paper.

3. Inversive invariants. An invariant under the conformal group is of
course also an invariant under the inversion group, defined by

$w = \dfrac{az + b}{cz + d}$,

since the latter is a subgroup of the former.


* If the orientation of the second side is reversed, a general right angle becomes a general 270°
angle. The effect of reversing orientation is to change the signs of
$\gamma_2, \gamma_2'', \gamma_2^{iv}, \dots$ but to leave $\gamma_2', \gamma_2''', \dots$ unaltered.
It follows that if we replace $\gamma_2$ by $-\gamma_2$ and $\gamma_2''$ by $-\gamma_2''$, (23)
becomes the invariant of a general 270° angle. The same replacements convert (1) and (15) into the
invariants of (curvilinear) straight angles of first and second order contact respectively.

All the invariants mentioned in this paper apply also to angles whose sides merely possess fifth
derivatives (for (1), third derivatives) at the vertex. This follows from the fact that such an angle
can be approximated to fifth (or third) order by an analytic angle, and the fact that a conformal
transformation preserves contact. The converses in Theorems II and IV are true only for analytic
angles, however.

Consider the horn angle of second order contact formed by a curve and its 
osculating circle. The value of the invariant (15) in this case is evidently 


(24) 


where $\gamma, \gamma', \dots$ represent curvature and its arc-length
derivatives for the given curve. Now inversion preserves the circle of
curvature. Consequently (24) is an invariant of a single curve under the
inversion group. This was first proved by G. W. Mullins, who showed that (24)
is in fact the inversive invariant of lowest order and gave all the higher
invariants.*

The question may be raised, for what curves does the value of (1), as
formed for the horn angle of first order contact between the curve and its
tangent line, remain constant along the curve? (These horn angles at any two
points of the curve would then be conformally equivalent, at least to any
finite order of contact.) The condition is $-\gamma'/\gamma^2 = \text{const.}$,
or $d\rho/ds = \text{const.}$ This is the intrinsic equation of the
logarithmic spirals.
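The passage from the condition on γ to the intrinsic equation is a one-line computation. The sketch below takes ρ to be the radius of curvature, ρ = 1/γ; the constants k and $\rho_0$ and the polar form $r = ae^{b\theta}$ are auxiliary notation introduced here.

```latex
\begin{align*}
\rho = \frac{1}{\gamma}
\quad\Longrightarrow\quad
\frac{d\rho}{ds} = -\frac{\gamma'}{\gamma^{2}} = k
\quad\Longrightarrow\quad
\rho = k\,s + \rho_0 .
\end{align*}
% Check on the spiral r = a e^{b\theta}, with s measured from the pole:
\begin{align*}
\rho = r\sqrt{1 + b^2}, \qquad
s = \frac{r\sqrt{1 + b^2}}{b}
\quad\Longrightarrow\quad
\rho = b\,s .
\end{align*}
```

A radius of curvature that is linear in arc length is thus exactly what the logarithmic spiral exhibits, with the constant equal to the cotangent-type parameter b of the spiral.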

The analogous question for the horn angle of second order contact
between a curve and its osculating circle can be solved by using a theorem of
Mullins. In that case the curves are the logarithmic double spirals, which may
be defined as the transforms by inversion of logarithmic spirals.†

Trihornometry for general horn angles of second order contact will be 
discussed in later papers. For each order a new type of Finsler space must be 
introduced. The total geometry is non-archimedian. 


* Differential Invariants under the Inversion Group, Columbia University Dissertation, 1917, 
p. 21. The J; there is identical with (24) above (with p=1/y). 

† Or as the isogonal trajectories of a pencil of circles through two fixed points, or as the images
of loxodromes on a sphere under stereographic projection from an arbitrary point of the sphere
(G. Holzmüller, Ueber die logarithmische Abbildung und die aus ihr entspringenden orthogonalen
Curvensysteme, Zeitschrift für Mathematik und Physik, vol. 16 (1871), p. 269).


UNIVERSITY,
NEW YORK, N. Y.


THE ABSOLUTE OPTICAL INSTRUMENT* 


BY 
J. L. SYNGE 


1. Introduction. An absolute optical instrument is a system of transparent
media which gives a precise point-image of each object-point lying in a
three-dimensional region, the law of light propagation being that of geometrical
optics. For a long time only two absolute instruments were known, the plane
mirror (or combination of plane mirrors) and the “fish's eye” of Maxwell,†
which consists of a single isotropic medium of variable refractive index


(1.1) $n = n_0a^2/(a^2 + r^2)$,


where $n_0$, a are constants, and r is the distance from a fixed point. Maxwell's
medium has been generalized by Lenz,‡ and Boegehold and Herzberger§
have given an infinite class of absolute instruments, consisting of
homogeneous isotropic media bounded by concentric spheres.

Less attention has been paid to the question of the realization of an abso- 
lute instrument by a distribution of transparent media than to the relations 
between object and image which must hold if the instrument is absolute. In 
the simplest case, in which the initial and final media are homogeneous and 
isotropic, it has been shown in various ways by Maxwell, Bruns, Klein, and 
Liebmann|| that if A, B are the images of object-points A’, B’, then 


(1.2) n'A’B’ = nAB, 


where n’, n are the (constant) refractive indices of the initial and final media. 
This relation may also be written 


(1.3) $[A'B'] = [AB]$,
where the brackets indicate optical lengths. 


* Presented to the Society, March 27, 1937; received by the editors February 26, 1937 and June 
16, 1937. 

† J. C. Maxwell, Cambridge and Dublin Mathematical Journal, vol. 8 (1854), p. 188; Scientific
Papers, vol. 1, Cambridge, 1890, pp. 76-79.

‡ W. Lenz, Probleme der Modernen Physik, edited by P. Debye, Leipzig, 1928, pp. 198-207.

§ H. Boegehold and M. Herzberger, Zeitschrift für Angewandte Mathematik und Mechanik, vol.
15 (1935), pp. 157-178.

|| For references and an historical account by H. Boegehold, see Czapski-Eppenstein, Grundziige 
der Theorie der Optischen Instrumente, Leipzig, 1924, p. 213 et seq. See also Carathéodory, Sitzungs- 
berichte der Bayerischen Akademie der Wissenschaften, Mathematisch-Physikalische Klasse, 1926, 
pp. 1-18. 
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Carathéodory* has extended this result to the case of more general media, 
namely, to those in which the velocity of propagation of light is the same in 
opposite directions and this velocity, considered as a function of the direction
of a ray, satisfies certain conditions of analyticity.† Further, by introducing
the idea of the field of the instrument, he made his result applicable
to actual instruments. 

In the present paper the approach to the problem of the absolute instru- 
ment differs from that of Carathéodory. Attention is directed to the surface 
of components rather than the wave-surface, and we are thereby enabled to 
see that when the optical character of one of the terminal media is assigned, 
the optical character of the other is rather closely restricted if the instrument 
is absolute. 

Extension to the case where the space is Riemannian and of N dimensions 
may easily be made; but since the problem is one of considerable optical in- 
terest, it has been thought best not to complicate the presentation by this 
extension. The problem may also be presented as a problem in Finsler space. 

2. Fundamental theory. Hamilton’s great achievement in geometrical optics
was the reconciliation, in the form of a single mathematical treatment,
of the emission theory of light and the wave theory,‡ or, in the language of
pure mathematics, the calculus of variations and the theory of contact
transformations. This comprehensive view places at our disposal for the discussion
of any problem in geometrical optics two alternative methods: the ray-method
and the wave-method.


The methods employed in the present paper are essentially those of 
Hamilton, although for brevity an indicial notation will be used, suffixes 
having the range 1, 2, 3, with the usual summation convention.§ Except 
for a modification of one of Hamilton’s definitions to suit present purposes, 
there is nothing in the present paper which might not have been given by 
Hamilton a century ago as an immediate deduction from his theory. The discussion
of an absolute instrument does not seem to have occurred to him;
the idea arose out of Gaussian optics.


* C. Carathéodory, loc. cit. The part of this paper containing the fundamental theorem is
reproduced, with no essential modification, in M. Born, Optik, Berlin, 1933, pp. 61-63.

† Professor Carathéodory has informed me by letter that in establishing the result he had
exclusively in mind media of the Fresnel type, or media in which similar conditions of analyticity are
satisfied. Unfortunately these conditions were not stated in his paper, and the reader might come to
the false conclusion that the result follows from the single assumption that the velocity of propagation
is the same in opposite directions. That is not so, and it is not difficult to construct artificial examples
in illustration. Carathéodory’s method is discussed in §9.

‡ Cf. G. Prange, W. R. Hamiltons Abhandlungen zur Strahlenoptik, Leipzig, 1933, Anmerkungen,
p. 104; The Mathematical Papers of Sir W. R. Hamilton, vol. 1, Cambridge, 1931, pp. 277, 497. The
latter will be referred to as M. P. H.

§ J. L. Synge, Journal of the Optical Society of America, vol. 27 (1937), pp. 75-82.

The essential parts of the theory will now be developed, care being taken 
to present them in a form applicable to an absolute instrument. Certain ex- 
ceptional features are present in such an instrument which may render invalid 
statements valid in more general cases. 

The instrument is composed of any number of media. Each medium has
a positive medium-function $v(x_r, \alpha_r)$, a function of rectangular cartesian
coordinates $x_r$ and of direction cosines $\alpha_r$; $v$ is homogeneous of degree unity in
the direction cosines. It is in general multiple-valued. If the units of space
and time are so chosen that the velocity of light in vacuo is unity, then $1/v$
is the ray velocity in the medium for a ray having direction cosines $\alpha_r$ at the
point $x_r$. (For an isotropic medium, $v$ is equal to the refractive index.)

By Fermat’s principle (accepted as a basic hypothesis) rays satisfy the
variational principle $\delta \int v\,ds = 0$, $ds$ being an element of arc.

The components of normal slowness corresponding to a ray through $x_r$ with
direction cosines $\alpha_r$ are defined as


(2.1)    $\sigma_r = \partial v/\partial \alpha_r$.


They were so called by Hamilton* because the vector $\sigma_r$ stands normal to the
wave and its magnitude is equal to the reciprocal of the wave velocity.

The right-hand side of (2.1) being homogeneous of degree zero in the
direction cosines, we can eliminate their ratios and obtain the medium-equation


(2.2)    $\Omega(x_r, \sigma_r) = 0$.


If, from an assigned point $x_r$, we measure off along each straight line with
direction cosines $\alpha_r$ a length $1/v$, obtaining a point with relative coordinates
$\alpha_r/v$, the surface so obtained is the wave-surface corresponding to $x_r$. On the
other hand if we draw from $x_r$ the totality of vectors $\sigma_r$ satisfying (2.2), we
get the surface of components corresponding to $x_r$. The wave-surface and the
surface of components are reciprocal surfaces with respect to the unit sphere
with centre $x_r$.†
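The definitions (2.1) and (2.2) can be tried on a simple case. The sketch below assumes a hypothetical medium-function $v = (m_1\alpha_1^2 + m_2\alpha_2^2 + m_3\alpha_3^2)^{1/2}$ with constants `m` chosen only for illustration; it is not a medium discussed in the paper. For this choice the medium-equation is $\sigma_1^2/m_1 + \sigma_2^2/m_2 + \sigma_3^2/m_3 = 1$, and the code checks that equation, the homogeneity relation $\sigma_r\alpha_r = v$, and the reciprocity of the point $\alpha_r/v$ of the wave-surface with $\sigma_r$.

```python
import math

def medium_function(alpha, m):
    """Hypothetical medium-function v = sqrt(sum_r m_r * alpha_r**2)."""
    return math.sqrt(sum(mr * ar * ar for mr, ar in zip(m, alpha)))

def slowness(alpha, m):
    """Components sigma_r = dv/dalpha_r of (2.1) for the medium above."""
    v = medium_function(alpha, m)
    return [mr * ar / v for mr, ar in zip(m, alpha)]

def medium_equation(sigma, m):
    """Omega = sum_r sigma_r**2 / m_r - 1; it vanishes on the surface of components."""
    return sum(s * s / mr for s, mr in zip(sigma, m)) - 1.0

m = (1.0, 2.25, 4.0)                 # illustrative constants (assumption)
alpha = (2 / 7, 3 / 7, 6 / 7)        # direction cosines: 4 + 9 + 36 = 49
v = medium_function(alpha, m)
sigma = slowness(alpha, m)
wave_point = [a / v for a in alpha]  # point of the wave-surface, relative to x_r

print("Omega(sigma)        =", medium_equation(sigma, m))
print("sigma . alpha - v   =", sum(s * a for s, a in zip(sigma, alpha)) - v)
print("sigma . wave_point  =", sum(s * w for s, w in zip(sigma, wave_point)))
```

The last printed value should be unity, which is the pole and polar relation with respect to the unit sphere for this particular medium.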


* M.P.H., p. 278; M. Herzberger, Strahlenoptik, Berlin, 1931, p. 9, calls this vector the “normal
vector.” “Slowness vector” might be a more descriptive abbreviation for the full title, namely, “the
vector representing the normal slowness of wave-propagation.” These components are of course the
optical analogues of the generalized momenta ($p_r$) of Hamilton’s dynamical theory.

† It seems a pity to reject Hamilton’s very suggestive terminology, wave-surface and surface of
components, in favor of the names indicatrix and figuratrix, which carry no intrinsic meaning. Unless
the extension to $n$ dimensions appears an important advance, it does not seem historically correct
to assign priority in the consideration of these surfaces to Minkowski and Hadamard (cf. C.
Carathéodory, Variationsrechnung, Leipzig und Berlin, 1935, p. 247). Hamilton had a priority of seventy
years, and even he assigned priority to Cauchy.

The optical length of any curve $C$ joining points $P$, $Q$ is


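(2.3)    $[PQ] = \int_C v\,ds$,


the directional arguments of $v$ being the direction cosines of the tangent to $C$,
in the sense $P$ to $Q$, and $ds$ being a positive element of arc. Giving a weak
variation with, in general, displacements of the end points $P$, $Q$, we know
from the calculus of variations that


(2.4)    $\delta[PQ] = \left(\frac{\partial v}{\partial \alpha_r}\,\delta x_r\right)_Q - \left(\frac{\partial v}{\partial \alpha_r}\,\delta x_r\right)_P - \int_C \left(\frac{d}{ds}\frac{\partial v}{\partial \alpha_r} - \frac{\partial v}{\partial x_r}\right)\delta x_r\,ds$.


Suppose now that $C$ is a ray; then, by the Euler equations, the integral vanishes
and we have, in the notation of (2.1),


(2.5)    $\delta[PQ] = (\sigma_r\,\delta x_r)_Q - (\sigma_r\,\delta x_r)_P$.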


The characteristic function of an instrument is a function $V$ of the coordinates
$x'_r$ of a point $A'$ in the initial medium $M'$ and the coordinates $x_r$ of a
point $A$ in the final medium $M$, such that $V$ is equal to the optical length of a
ray joining $A'$ and $A$. The function $V$ may not be defined for all values of
the six variables $x'_r$, $x_r$ corresponding to points $A'$, $A$ in the initial and final
media respectively, because there may exist no ray joining $A'$ and $A$. In
general $V$ is multiple-valued, since $v$ is multiple-valued.

We shall use accents to denote quantities pertaining to the initial medium
$M'$ of the instrument; quantities pertaining to the final medium will be left
unaccented. Different rectangular axes for $x'_r$ and $x_r$ may be used for $M'$ and
$M$ respectively. Inspection of the method by which (2.5) has been established
shows that the axes to which quantities at $P$ are referred need not be the
same as the axes to which quantities at $Q$ are referred. From (2.5) we see at
once that if we pass from a ray joining points $A'$, $A$ to an adjacent ray
joining points $B'$, $B$, the variation in $V$ is*


(2.6)    $\delta V = \sigma_r\,\delta x_r - \sigma'_r\,\delta x'_r$.


* This is the fundamental equation of Hamilton’s theory, called by him the equation of the
characteristic function (M.P.H., p. 168). It is of course the optical analogue of the fundamental
equation defining contact transformations in dynamics. Hamilton did not frame his theory to take into
account the exceptional case presented by an absolute instrument; he assumed that six arbitrary
variations $\delta x'_r$, $\delta x_r$ were permissible in (2.6), so that

    $\partial V/\partial x_r = \sigma_r$,    $\partial V/\partial x'_r = -\sigma'_r$.

This cannot be done in the case of an absolute instrument if the points $x'_r$ and $x_r$ correspond to object
and image respectively. The argument of the present paper, which makes no such illegitimate
assumption, has already been given by Herzberger (op. cit., p. 12) in a different notation.

Following Carathéodory, we shall say that a ray passing through a point
$A'$ of $M'$ with direction cosines $\alpha'_r$ lies in the field of the instrument if it passes
through the instrument to the final medium $M$. At a given point $A'$ of $M'$
the directions of the rays lying in the field of the instrument will be bounded
by a cone of unidirectional straight lines drawn out from $A'$.

3. Necessary conditions for an absolute instrument. Consider the rays 
emanating in the field of the instrument from an initial point A’. As the 
medium-functions of the media in the instrument may be multiple-valued, 
these rays may pass through the final medium in a number of congruences, 
which we distinguish from one another by calling them congruences of 
different types. 

If all the rays in the final congruence of type $T$ pass through a single
point $A$, we may say that the instrument is absolute (type $T$) with respect to
the source $A'$, $A$ being the image (type $T$) of $A'$. If the instrument is absolute
as above with respect to all initial points and directions in the field of the
instrument, we may say that the instrument is absolute of type $T$. If it is
absolute for all types, we may say that it is completely absolute.

In the present paper we shall be concerned with absolute character for 
final congruences of a definite type. For simplicity, we shall refer to an instru- 
ment possessing this absolute character as absolute, without further explicit 
qualification. 

In an absolute instrument there exists a one-to-one correspondence


(3.1)    $x_r = x_r(x'_1, x'_2, x'_3)$


between the points of the initial and final media, $A$ with coordinates $x_r$ being
the image of $A'$ with coordinates $x'_r$.

If, given $A'$, we seek points $A$ such that a ray joins $A'$ to $A$, we find our
choice of $A$ restricted to that part of the final medium traversed by the
congruence of rays from $A'$. Thus, given $A'$ with coordinates $x'_r$, the
characteristic $V(x'_r, x_r)$ is defined only for a restricted range of values of $x_r$. This range
includes of course the image of $A'$; the coordinates of that image are, by (3.1),
functions of $x'_r$. When $A$ is chosen at the image of $A'$, there is an infinity of
rays joining the two points, but by Fermat’s principle their optical lengths
are all the same. Hence in an absolute instrument there exists an absolute
characteristic, namely, a function of the three initial coordinates, or of the three final
coordinates, whose value is equal to the optical length of a ray joining object to
image. This absolute characteristic will be denoted by $F$.

Let us now take two adjacent image-points $x_r$, $x_r + \delta x_r$, and compare the
optical lengths of the rays drawn to these points from their respective
objects. The difference between these optical lengths, by (2.6) and (3.1), is
given by


(3.2)    $\delta F = \sigma_r\,\delta x_r - \sigma'_r\,\delta x'_r = \left(\sigma_r - \sigma'_s\,\frac{\partial x'_s}{\partial x_r}\right)\delta x_r$,


where $\sigma'_r$ are the components of any ray (lying in the field) through the
object $x'_r$ and $\sigma_r$ the components of the corresponding ray through the image $x_r$.
Since $\delta x_r$ are arbitrary, (3.2) gives the three equations


(3.3)    $\sigma_r = \frac{\partial F}{\partial x_r} + \sigma'_s\,\frac{\partial x'_s}{\partial x_r}$,


which are the fundamental equations of the present paper.

The implications of (3.3) may be explored in various ways. Thus we may 
suppose the absolute characteristic F and the transformation (3.1) given, 
and inquire into the conditions imposed by (3.3) on the optical characters of 
the initial and final media. Or we may suppose that the initial and final media 
are of assigned optical characters, given by medium-functions $v'(x'_r, \alpha'_r)$,
$v(x_r, \alpha_r)$ or by surfaces of components


(3.4)    $\Omega'(x'_r, \sigma'_r) = 0$,    $\Omega(x_r, \sigma_r) = 0$;


we may then inquire into the conditions imposed by (3.3) on the absolute 
characteristic F and the transformation (3.1). The latter type of approach has 
been the usual one. 

4. The optical character of one terminal medium in an absolute instrument
deduced from that of the other. Let us write (3.3) as


(4.1)    $\sigma_r = a_{rs}\sigma'_s + b_r$,


where


(4.2)    $a_{rs} = \frac{\partial x'_s}{\partial x_r}$,    $b_r = \frac{\partial F}{\partial x_r}$.


Let us consider (4.1) as a transformation expressing final components $\sigma_r$ in
terms of initial components $\sigma'_r$, the corresponding object and image-points
($A'$ and $A$) and the absolute characteristic being given. Thus for present
purposes $a_{rs}$ and $b_r$ are to be treated as constants; (4.1) is a linear
transformation.

Consider now the surface of components $S'$ corresponding to $A'$ (with
equation $\Omega' = 0$) and the surface of components $S$ corresponding to $A$ (with
equation $\Omega = 0$). Consider any ray through $A'$ in the field of the instrument.
Let its components be $\sigma'_r$. Then the vector drawn from $A'$ with components
$\sigma'_r$ parallel to the axes of $x'_r$ has its end on $S'$. The corresponding final ray
has components given by (4.1).


FIG. 1


But the vector drawn from $A$ with components $\sigma_r$ parallel to the axes of $x_r$
must have its end on $S$ (Fig. 1). Hence we have the following result:


THEOREM I. In an absolute instrument the portions lying in the field of the 
instrument of the surfaces of components corresponding to an object-point A’ and 
its image A are transformable into one another by a linear transformation, whose 
coefficients are in general functions of the coordinates of A’. 


Thus, insofar as the condition (3.3) is concerned, we may choose arbi- 
trarily the following: 

(i) the absolute characteristic $F$,

(ii) the correspondence $x_r = x_r(x'_s)$,

(iii) the surfaces of components (that is, the equation $\Omega'(x'_r, \sigma'_r) = 0$)
throughout the initial medium.

Then the surfaces of components for the final medium (and hence its
medium-function $v$) can be found to satisfy (3.3). We have in fact by (3.3)


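(4.3)    $\sigma'_r = \frac{\partial x_s}{\partial x'_r}\left(\sigma_s - \frac{\partial F}{\partial x_s}\right)$,


so that $\Omega'(x'_r, \sigma'_r) = 0$ implies


(4.4)    $\Omega'\left(x'_r,\ \frac{\partial x_s}{\partial x'_r}\left(\sigma_s - \frac{\partial F}{\partial x_s}\right)\right) = 0$;


these are the required surfaces of components in the final medium, (4.4) being
immediately expressible in the form $\Omega(x_r, \sigma_r) = 0$.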

In particular we may observe that if there is an absolute instrument in 
which the surfaces of components in one of the terminal media are ellipsoids 
(the wave-surfaces being consequently, by the reciprocal relation, ellipsoids 
also), then the surfaces of components in the other terminal medium are also 
ellipsoids (and so are the wave-surfaces). 
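The remark about ellipsoids can be checked numerically. The sketch below is an illustration only: the matrix `A`, the vector `B` (standing for $a_{rs}$ and $b_r$ of (4.1)) and the squared semi-axes `M_PRIME` are arbitrary assumed values, not data from the paper. Points of an ellipsoid $S'$ are carried by $\sigma_r = a_{rs}\sigma'_s + b_r$, and the images are tested against the quadric with matrix $(a^{-1})^T Q' a^{-1}$ centred at $b_r$.

```python
import math

def mat_vec(a, x):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

def mat_mul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def inverse3(a):
    """Inverse of a 3x3 matrix by cofactors (adjugate divided by determinant)."""
    d = det3(a)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [k for k in range(3) if k != i]
            cols = [k for k in range(3) if k != j]
            minor = (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
                     - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])
            # cofactor (i, j) is stored at position (j, i) of the inverse
            inv[j][i] = (-1) ** (i + j) * minor / d
    return inv

# Assumed illustrative data: S' is sum_r sigma'_r**2 / M_PRIME[r] = 1.
M_PRIME = (1.0, 2.25, 4.0)
A = [[1.2, 0.3, 0.0], [-0.1, 0.9, 0.4], [0.2, 0.0, 1.5]]
B = [0.5, -0.2, 0.1]

A_INV = inverse3(A)
Q_PRIME = [[1.0 / M_PRIME[i] if i == j else 0.0 for j in range(3)]
           for i in range(3)]
# Matrix of the image quadric: Q = (A^-1)^T Q' A^-1.
Q = mat_mul(transpose(A_INV), mat_mul(Q_PRIME, A_INV))

def image_residual(theta, phi):
    """Map a point of S' by (4.1) and evaluate the image quadric minus 1."""
    sp = [math.sqrt(M_PRIME[0]) * math.sin(theta) * math.cos(phi),
          math.sqrt(M_PRIME[1]) * math.sin(theta) * math.sin(phi),
          math.sqrt(M_PRIME[2]) * math.cos(theta)]
    s = [x + b for x, b in zip(mat_vec(A, sp), B)]
    d = [x - b for x, b in zip(s, B)]  # displacement from the centre b_r
    return sum(d[i] * Q[i][j] * d[j] for i in range(3) for j in range(3)) - 1.0

max_residual = max(abs(image_residual(0.3 * i, 0.7 * j))
                   for i in range(10) for j in range(10))
# Positive leading principal minors show that Q is positive definite (an ellipsoid).
minors = (Q[0][0], Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0], det3(Q))
print(max_residual, minors)
```

The image is again an ellipsoid, but its centre is displaced to $b_r$; it is central only when $b_r = 0$.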

More generally, if in an absolute instrument the surfaces of components 
throughout one of the terminal media are undegenerate algebraic surfaces of 
the $n$th degree, then the surfaces of components throughout the other terminal
medium are also algebraic surfaces of the $n$th degree.

The assumption of algebraic character carries with it an important impli- 
cation (as would also a suitable assumption regarding analyticity), namely, 
that a linear transformation which carries the portion of one of the surfaces 
of components corresponding to rays lying in the field of the instrument into 
a portion of the other surface of components, also carries the whole of one 
surface of components into the whole of the other. It is this fact that enables 
us to make statements about complete surfaces of components, although our 
data deal only with the portions corresponding to rays lying in the field of 
the instrument. 

5. The case where the surfaces of components in the terminal media are 
general algebraic surfaces. Let us now suppose that there is an instrument 
with given terminal media in which the surfaces of components are
undegenerate algebraic surfaces of the $n$th degree. We may write


(5.1)    $\Omega' = 1 + c'_r\sigma'_r + c'_{rs}\sigma'_r\sigma'_s + \cdots$ (to $n+1$ terms) $= 0$,    $(c'_{rs} = c'_{sr})$,


(5.2)    $\Omega = 1 + c_r\sigma_r + c_{rs}\sigma_r\sigma_s + \cdots$ (to $n+1$ terms) $= 0$,    $(c_{rs} = c_{sr})$,


where $c'_r$, $c'_{rs}$, $\ldots$ are assigned functions of $x'_r$, and $c_r$, $c_{rs}$, $\ldots$ are assigned
functions of $x_r$.
We wish to investigate the possibility of constructing an absolute
instrument with these terminal media, taking into consideration the condition (3.3).
Adopting the notation of (4.1) and substituting in (5.2), we get


(5.3)    $\Omega = 1 + c_r(a_{rs}\sigma'_s + b_r) + c_{rs}(a_{rt}\sigma'_t + b_r)(a_{su}\sigma'_u + b_s) + \cdots = 0$;

this must be the same as the surface (5.1). Thus we require


(5.4)    $1 + c_r b_r + c_{rs} b_r b_s + \cdots = \theta$,

together with the equations obtained by equating the coefficients of $\sigma'_s$,
$\sigma'_s\sigma'_t$, $\ldots$ in (5.3) to $\theta c'_s$, $\theta c'_{st}$, $\ldots$,
where $\theta$ is a factor of proportionality. Here we have a set of equations for
five unknown functions of $x_r$, namely, $F$, $x'_r$, $\theta$. Omitting from consideration
for obvious reasons the case $n = 1$, let us pass to the case $n = 2$, for which the
surfaces are quadrics. There are then ten coefficients in $\Omega$ and hence ten
equations in (5.4) for five unknowns. If $n > 2$, there are of course more equations.
It is therefore in general impossible to construct an absolute instrument with arbi- 
trarily assigned terminal media in which the surfaces of components (or the wave- 
surfaces) are undegenerate algebraic surfaces of the second degree or of higher de- 
gree. 
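The count of equations against unknowns can be tabulated. The sketch below assumes only that the number of equations in (5.4) equals the number of coefficients of a general polynomial of degree $n$ in the three components, the constant term included; the function name is chosen here for illustration.

```python
from math import comb

def coefficient_count(n, variables=3):
    """Number of monomials of degree <= n in the given number of variables."""
    return comb(n + variables, variables)

UNKNOWNS = 5  # F, x'_1, x'_2, x'_3 and the factor of proportionality theta

for n in (2, 3, 4):
    equations = coefficient_count(n)
    print(f"n = {n}: {equations} equations, {equations - UNKNOWNS} in excess of the unknowns")
```

For $n = 2$ this gives the ten equations mentioned in the text.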

6. The case where the surfaces of components in the terminal media are 
central algebraic surfaces. By a central surface we shall understand a surface 
which has a centre at the origin; for a surface of components or wave-surface, 
the origin is the point in the medium to which the surface corresponds. 

A linear transformation (4.1) which carries a central surface into a central 
surface must lack the absolute term. Hence we have this result (of importance 
in connection with Carathéodory’s theorem and implicit in his proof, unless 
replaced by a condition of analyticity) : 


THEOREM II. In an absolute optical instrument in which the surfaces of components
(or, equivalently, the wave-surfaces) are undegenerate central algebraic
surfaces, the absolute characteristic is a constant. This result also holds for de- 
generate central surfaces, provided that each of the separate surfaces given by 
degeneration is central. 


Under these circumstances the transformation of components is simply


(6.1)    $\sigma_r = a_{rs}\sigma'_s$.


7. The case of isotropic terminal media. When the terminal media are 
isotropic, the surfaces of components are spheres with the equations 


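(7.1)    $\Omega' = \sigma'_r\sigma'_r - n'^2 = 0$,
(7.2)    $\Omega = \sigma_r\sigma_r - n^2 = 0$,

where $n'$, $n$ are refractive indices, functions of position. Applying (6.1), we
see that if the instrument is absolute, the correspondence of object and
image-points must satisfy


(7.3)    $\frac{\partial x'_s}{\partial x_r}\frac{\partial x'_t}{\partial x_r} = \frac{n^2}{n'^2}\,\delta_{st}$,


where $\delta_{st}$ is the Kronecker delta. These partial differential equations, six in
number, are not in general soluble for the three unknowns $x'_r$. We deduce
from them (the matrix $\partial x'_s/\partial x_r$ being $n/n'$ times an orthogonal matrix)


(7.4)    $\frac{\partial x'_s}{\partial x_r}\frac{\partial x'_s}{\partial x_t} = \frac{n^2}{n'^2}\,\delta_{rt}$,


and hence


    $n'^2 ds'^2 = n'^2\,dx'_s\,dx'_s = n'^2\,\frac{\partial x'_s}{\partial x_r}\frac{\partial x'_s}{\partial x_t}\,dx_r\,dx_t = n^2\,dx_r\,dx_r = n^2 ds^2$.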


This establishes (by a method different from, and perhaps more direct than, 
the method of Carathéodory) the equality of optical lengths for object and 
image elements in an absolute instrument for which the terminal media are 
isotropic and in general heterogeneous. A necessary condition for the abso- 
lute character of such an instrument is the applicability of the terminal 
media, considered as Riemannian spaces with metrics nds, n'ds’. 
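A correspondence satisfying the condition (7.3), read as $(\partial x'_s/\partial x_r)(\partial x'_t/\partial x_r) = (n^2/n'^2)\,\delta_{st}$, can be exhibited numerically. The sketch below takes the inversion $x'_s = a^2 x_s/r^2$ purely as a convenient conformal map (an assumption made for illustration; no claim is made that a particular instrument realizes it). The Jacobian is estimated by central differences and the ratio $n^2/n'^2$ is compared with $a^4/r^4$.

```python
def inversion(x, a=1.0):
    """Illustrative correspondence x'_s = a**2 * x_s / r**2 (an assumption)."""
    r2 = sum(c * c for c in x)
    return [a * a * c / r2 for c in x]

def jacobian(f, x, h=1e-6):
    """Central-difference estimate of J[s][r] = d f_s / d x_r."""
    J = [[0.0] * 3 for _ in range(3)]
    for r in range(3):
        xp = list(x)
        xm = list(x)
        xp[r] += h
        xm[r] -= h
        fp, fm = f(xp), f(xm)
        for s in range(3):
            J[s][r] = (fp[s] - fm[s]) / (2 * h)
    return J

def gram(J):
    """G[s][t] = sum_r J[s][r] * J[t][r], the left-hand side of (7.3)."""
    return [[sum(J[s][r] * J[t][r] for r in range(3)) for t in range(3)]
            for s in range(3)]

x = [0.8, -0.5, 1.1]
G = gram(jacobian(inversion, x))
r2 = sum(c * c for c in x)
predicted_ratio = 1.0 / r2 ** 2  # a**4 / r**4 with a = 1

for row in G:
    print(["%.6f" % g for g in row])
print("predicted n^2/n'^2 =", predicted_ratio)
```

The printed matrix should be a multiple of the unit matrix, the multiple being the predicted ratio, so that for this map the metrics $n\,ds$ and $n'\,ds'$ agree when $n/n' = a^2/r^2$.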

8. The case of Fresnel media. In a biaxial crystal, or homogeneous Fresnel
medium, the surface of components referred to principal axes has the
equation*


(8.1)    $\frac{\sigma_1^2}{1 - c_1^2 q^2} + \frac{\sigma_2^2}{1 - c_2^2 q^2} + \frac{\sigma_3^2}{1 - c_3^2 q^2} = 0$,    $q^2 = \sigma_r\sigma_r$,


where $c_r$ are constants (the principal velocities). Cleared of fractions, this
equation reads


(8.2)    $q^2 A^0_{mn}\sigma_m\sigma_n - B^0_{mn}\sigma_m\sigma_n + 1 = 0$,    $q^2 = \sigma_r\sigma_r$,


where


(8.3)    $A^0_{11} = c_2^2 c_3^2$,    $A^0_{22} = c_3^2 c_1^2$,    $A^0_{33} = c_1^2 c_2^2$,
         $B^0_{11} = c_2^2 + c_3^2$,    $B^0_{22} = c_3^2 + c_1^2$,    $B^0_{33} = c_1^2 + c_2^2$,


and $A^0_{mn} = B^0_{mn} = 0$ if $m \neq n$.
If we change to an arbitrary set of rectangular axes (not principal) the
surface of components has an equation of the form


(8.4)    $q^2 A_{mn}\sigma_m\sigma_n - B_{mn}\sigma_m\sigma_n + 1 = 0$,    $q^2 = \sigma_r\sigma_r$,


where $A_{mn} = A_{nm}$, $B_{mn} = B_{nm}$, these quantities being found from $A^0_{mn}$, $B^0_{mn}$
respectively by application of the rules of tensor transformation. Given an
equation (8.4), necessary and sufficient conditions that it should define the
surface of components in a homogeneous Fresnel medium are as follows:

(i) The roots of the determinantal equations


(8.5)    $|A_{mn} - \lambda\delta_{mn}| = 0$,    $|B_{mn} - \mu\delta_{mn}| = 0$,

are of the form

(8.6)    $\lambda_1 = c_2^2 c_3^2$,    $\lambda_2 = c_3^2 c_1^2$,    $\lambda_3 = c_1^2 c_2^2$,
         $\mu_1 = c_2^2 + c_3^2$,    $\mu_2 = c_3^2 + c_1^2$,    $\mu_3 = c_1^2 + c_2^2$,


where the $c_r$ are constants.
(ii) Each of the three perpendicular directions defined by


(8.7)    $A_{mn}\sigma_n - \lambda_r\sigma_m = 0$
for fixed $r$, coincides with the direction defined by
(8.8)    $B_{mn}\sigma_n - \mu_r\sigma_m = 0$


for the same value of $r$.


* Cf. M.P.H., p. 280; Born, op. cit., p. 224.


A heterogeneous Fresnel medium may be defined as a medium in which
at each point there exists a set of axes such that the surface of components
corresponding to that point has the form (8.1). Equivalently, we may say
that, for a single set of rectangular cartesian coordinates throughout the
medium, a heterogeneous Fresnel medium is one in which the surface of
components at any point has the form (8.4), where $A_{mn}$, $B_{mn}$ are functions of the
coordinates $x_r$ satisfying the conditions (i) and (ii) above, in which $c_r$ are
no longer constants but functions of $x_r$. We shall confine our attention to the
undegenerate case in which $c_r$ are distinct.
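The passage from (8.1) to (8.2) may be verified numerically. The sketch below assumes the readings $A^0_{11} = c_2^2c_3^2$, $B^0_{11} = c_2^2 + c_3^2$ (and cyclically) for (8.3), and checks the identity: the left-hand side of (8.1), multiplied by the three denominators, equals $q^2$ times the left-hand side of (8.2). The sample values of $c_r$ and $\sigma_r$ are arbitrary.

```python
def fresnel_cleared(sigma, c):
    """Left-hand side of (8.1) multiplied through by the three denominators."""
    q2 = sum(s * s for s in sigma)
    d = [1.0 - ci * ci * q2 for ci in c]
    return (sigma[0] ** 2 * d[1] * d[2]
            + sigma[1] ** 2 * d[2] * d[0]
            + sigma[2] ** 2 * d[0] * d[1])

def fresnel_quartic(sigma, c):
    """q^2 times the left-hand side of (8.2), with the diagonal A0, B0 of (8.3)."""
    q2 = sum(s * s for s in sigma)
    c2 = [ci * ci for ci in c]
    a0 = [c2[1] * c2[2], c2[2] * c2[0], c2[0] * c2[1]]
    b0 = [c2[1] + c2[2], c2[2] + c2[0], c2[0] + c2[1]]
    a_form = sum(a0[i] * sigma[i] ** 2 for i in range(3))
    b_form = sum(b0[i] * sigma[i] ** 2 for i in range(3))
    return q2 * (q2 * a_form - b_form + 1.0)

c = (0.6, 0.8, 1.1)  # arbitrary principal velocities (assumption)
samples = [(0.3, -0.7, 0.2), (1.1, 0.4, -0.9), (0.05, 0.5, 1.4)]
differences = [abs(fresnel_cleared(s, c) - fresnel_quartic(s, c)) for s in samples]
print(differences)
```

The differences should vanish to rounding error for every $\sigma_r$, not only on the surface of components, since the relation is an algebraic identity.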

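Let us now consider an absolute instrument in which the terminal media
are Fresnel media, in general heterogeneous. Their surfaces of components
have the equations


(8.9)    $q'^2 A'_{mn}\sigma'_m\sigma'_n - B'_{mn}\sigma'_m\sigma'_n + 1 = 0$,    $q'^2 = \sigma'_r\sigma'_r$,
(8.10)   $q^2 A_{mn}\sigma_m\sigma_n - B_{mn}\sigma_m\sigma_n + 1 = 0$,    $q^2 = \sigma_r\sigma_r$.


Since these are undegenerate central algebraic surfaces, the formula for
transformation of components is homogeneous of the form (6.1). Hence we must
have


(8.11)   $B_{mn}\sigma_m\sigma_n = B'_{mn}\sigma'_m\sigma'_n$,
(8.12)   $q^2 A_{mn}\sigma_m\sigma_n = q'^2 A'_{mn}\sigma'_m\sigma'_n$.

Two cases must now be distinguished, according to the way in which we break
up (8.12):
Case I:


(8.13)   $q^2 = \theta^2 q'^2$,    $A_{mn}\sigma_m\sigma_n = \theta^{-2} A'_{mn}\sigma'_m\sigma'_n$;

Case II:

(8.14)   $q^2 = \phi^2 A'_{mn}\sigma'_m\sigma'_n$,    $A_{mn}\sigma_m\sigma_n = \phi^{-2} q'^2$;


here $\theta$, $\phi$ are undetermined functions of $x_r$.
Let us choose a pair of corresponding points (object and image) and
choose our axes in the principal directions corresponding. Then (8.11) reads


(8.15)   $\sum (c_2^2 + c_3^2)\sigma_1^2 = \sum (c'^2_2 + c'^2_3)\sigma'^2_1$,

where $\sum$ indicates a sum of three terms obtained by cyclic permutation of
1, 2, 3. Therefore, renumbering the axes if necessary,

         $(c_2^2 + c_3^2)\sigma_1^2 = (c'^2_2 + c'^2_3)\sigma'^2_1$,
(8.16)   $(c_3^2 + c_1^2)\sigma_2^2 = (c'^2_3 + c'^2_1)\sigma'^2_2$,
         $(c_1^2 + c_2^2)\sigma_3^2 = (c'^2_1 + c'^2_2)\sigma'^2_3$.


Let us now consider separately Case I and Case II.
Case I: Substitution from (8.16) in (8.13) gives


(8.17)   $\sum \frac{c'^2_2 + c'^2_3}{c_2^2 + c_3^2}\,\sigma'^2_1 = \theta^2 \sum \sigma'^2_1$,


(8.18)   $\sum c_2^2 c_3^2\,\frac{c'^2_2 + c'^2_3}{c_2^2 + c_3^2}\,\sigma'^2_1 = \theta^{-2} \sum c'^2_2 c'^2_3\,\sigma'^2_1$.


From (8.17) we have

         $c'^2_2 + c'^2_3 = \theta^2(c_2^2 + c_3^2)$,
(8.19)   $c'^2_3 + c'^2_1 = \theta^2(c_3^2 + c_1^2)$,
         $c'^2_1 + c'^2_2 = \theta^2(c_1^2 + c_2^2)$,

from which it follows that

(8.20)   $c'_1 = \theta c_1$,    $c'_2 = \theta c_2$,    $c'_3 = \theta c_3$;

then (8.18) is automatically satisfied.

Case II: Substitution from (8.16) in (8.14) gives


(8.21)   $\sum \frac{c'^2_2 + c'^2_3}{c_2^2 + c_3^2}\,\sigma'^2_1 = \phi^2 \sum c'^2_2 c'^2_3\,\sigma'^2_1$,


(8.22)   $\sum c_2^2 c_3^2\,\frac{c'^2_2 + c'^2_3}{c_2^2 + c_3^2}\,\sigma'^2_1 = \phi^{-2} \sum \sigma'^2_1$.


From (8.21) we have

         $1/c'^2_2 + 1/c'^2_3 = \phi^2(c_2^2 + c_3^2)$,
(8.23)   $1/c'^2_3 + 1/c'^2_1 = \phi^2(c_3^2 + c_1^2)$,
         $1/c'^2_1 + 1/c'^2_2 = \phi^2(c_1^2 + c_2^2)$,

and hence


(8.24)   $1/c'_1 = \phi c_1$,    $1/c'_2 = \phi c_2$,    $1/c'_3 = \phi c_3$;


then (8.22) is automatically satisfied. Hence we have this result: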


THEOREM III. In an absolute instrument in which the terminal media are
undegenerate Fresnel media (in general heterogeneous), the principal velocities 
at a point in one medium are either directly or inversely proportional to the prin- 
cipal velocities at the corresponding point in the other medium. 

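9. Carathéodory’s method. By (3.2), valid for an absolute instrument, we
have

(9.1)    $\delta F = \sigma_r\,\delta x_r - \sigma'_r\,\delta x'_r$.

Now it is immediately obvious that if we take adjacent points $A'$, $B'$ on a
ray in the initial medium in the field of an absolute instrument, their images
$A$, $B$ lie on the final portion of the ray in question. Taking in (9.1) $\delta x'_r$ to
be the displacement from $A'$ to $B'$ and $\delta x_r$ to be the displacement from $A$ to
$B$, and denoting the direction cosines of initial and final rays as usual by
$\alpha'_r$, $\alpha_r$, we may substitute in (9.1)

(9.2)    $\delta x_r = \alpha_r\,ds$,    $\delta x'_r = \alpha'_r\,ds'$,

where $ds$, $ds'$ are respectively equal to $AB$, $A'B'$. Thus we obtain, on division
by $ds$,


(9.3)    $\frac{\partial F}{\partial x_r}\,\alpha_r = \sigma_r\alpha_r - \sigma'_r\alpha'_r\,\frac{ds'}{ds}$,


or, by (2.1) and the homogeneity of $v$,


(9.4)    $\frac{\partial F}{\partial x_r}\,\alpha_r = v(x_r, \alpha_r) - v'(x'_r, \alpha'_r)\,\frac{ds'}{ds}$.


But


(9.5)    $v'(x'_r, \alpha'_r)\,\frac{ds'}{ds} = v'\left(x'_r, \frac{dx'_r}{ds'}\right)\frac{ds'}{ds} = v'\left(x'_r, \frac{dx'_r}{ds}\right) = v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\alpha_s\right)$,


and (9.4) thus becomes


(9.6)    $\frac{\partial F}{\partial x_r}\,\alpha_r = v(x_r, \alpha_r) - v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\alpha_s\right)$.


Except for trivial changes in notation, this is the fundamental equation (14)
of Carathéodory’s paper (loc. cit.).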

Let us now reconstruct Carathéodory’s theorem regarding equality of
optical lengths, filling in the part of the proof not given explicitly in his paper.
We shall assume that the instrument is absolute for rays (in the field) of a
certain type, corresponding to a certain one of the values ($v'$) of the
multiple-valued medium-function of the initial medium and to a certain one of the
values ($v$) of the multiple-valued medium-function of the final medium. We
assume that these single-valued functions $v'$, $v$ are even analytic functions of
$\alpha'_r$, $\alpha_r$ respectively, throughout three-dimensional spaces in which $\alpha'_r$, $\alpha_r$ are
coordinates, except perhaps for certain singular lines through the origins.

Let $P$ be any point in the region of the $\alpha_r$ space which is filled with lines
drawn in the directions of rays lying in the field of the instrument, and let $Q$
be the reflection of $P$ in the origin of the $\alpha_r$ space. Let $C$ be an analytic curve
joining $P$ and $Q$. Then if we put


    $\phi(u) = \frac{\partial F}{\partial x_r}\,\alpha_r - \left[v(x_r, \alpha_r) - v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\alpha_s\right)\right]$,


where $u$ is the parameter on $C$, $\phi(u)$ is an analytic function which vanishes
for a range of values of $u$ adjacent to $P$. Therefore $\phi(u) = 0$ along $C$ and, in
particular, at $Q$. But the first part of $\phi(u)$ is an odd function, and the second
part is an even function. Hence, adding and subtracting the values of $\phi(u)$
at $P$ and $Q$, we get


(9.7)    $\partial F/\partial x_r = 0$,    $v(x_r, \alpha_r) = v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\alpha_s\right)$


throughout the field in which $P$ has been taken. Applying the analytic
condition again, we see that the second relation holds for all values of $\alpha_r$, except
for the excluded singularities. Hence if $\delta x'_r$, $\delta x_r$ are object and image elements
with lengths $\delta s'$, $\delta s$ and direction cosines $\alpha'_r$, $\alpha_r$ respectively, we have


(9.8)    $v(x_r, \alpha_r)\,\delta s = v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\alpha_s\right)\delta s = v'\left(x'_r, \frac{\partial x'_r}{\partial x_s}\,\delta x_s\right) = v'(x'_r, \alpha'_r)\,\delta s'$.


Thus Carathéodory’s method establishes the equality of optical lengths for
corresponding elements in an absolute instrument. The method by which the 
fundamental equation (9.1) is obtained in the present paper (indeed, it is 
deducible in a couple of lines from Hamilton’s equation of the characteristic 
function, properly interpreted) appears more direct than the method given 
by Carathéodory. The latter part of the argument given above is merely the 
expansion of Carathéodory’s proof, with a full statement of the implied condi- 
tions. 

The necessary amplification of his original proof will also be found in 
Carathéodory, Geometrische Optik, Berlin, 1937, p. 70, which appeared after 
the present paper was written. Reference should be made to W. Blaschke, 
Abhandlungen aus dem Mathematischen Seminar der Hansischen Universität,
vol. 11, 1936, pp. 409-412; although not based on the characteristic
function, his argument is similar to that of the present paper, for he directs 
attention to the surface of components and the linear transformation (4.1). 
I owe this reference to W. C. Graustein.* 


* The last paragraph was added in proof, May 20, 1938.

POLYNOMIAL APPROXIMATIONS FOR ELLIPTIC
FUNCTIONS*

1. Equivalences modulo $y^{r+1}$. The approximations to be obtained are with
respect to $k^2$, where $k$ is the modulus of the Jacobian elliptic functions. It will
be convenient to use the arithmetical concept of congruence, applied to
absolutely convergent power series in two independent variables $x$, $y$.

All the power series $F(x, y)$, $G(x, y)$, $H(x, y)$, $\ldots$, $P(x, y)$, $\ldots$ considered
have a common maximum domain $R$, different from $|xy| = 0$, of absolute
convergence; that is, $R$ is the only domain of absolute convergence,
different from $|xy| = 0$, of all the series considered, containing $R$.

The series are given initially in the forms


    $F(x, y) = \sum_{s,t=0}^{\infty} f_{ts}\,x^t y^s$,  $\ldots$,  $P(x, y) = \sum_{s,t=0}^{\infty} p_{ts}\,x^t y^s$,  $\ldots$;


they may be rearranged with respect to ascending powers of $y$; thus, for
example,


(1.1)    $F(x, y) = \sum_{s=0}^{\infty} y^s F^{(s)}(x)$,


where


    $F^{(s)}(x) = \sum_{t=0}^{\infty} f_{ts}\,x^t$.

Let $r$ be a constant integer $\geq 0$. Then (1.1) may be written as

(1.2)    $F(x, y) = F_r(x, y) + y^{r+1} F^{(r+1)}(x, y)$,

in which

    $F_r(x, y) = \sum_{s=0}^{r} y^s F^{(s)}(x)$,    $F^{(r+1)}(x, y) = \sum_{s=0}^{\infty} y^s F^{(r+1+s)}(x)$.


As a function of $y$, $F_r(x, y)$ is a polynomial of degree $\leq r$. The degree of the
lowest power of $y$ occurring in $F^{(r+1)}(x, y)$ is $\geq 0$. In analogy with arithmetic,
we write (1.2) as


(1.3)    $F(x, y) \equiv F_r(x, y) \pmod{y^{r+1}}$;


* Presented to the Society, December 28, 1937; received by the editors July 16, 1937.


$F_r(x, y)$ is the residue of $F(x, y)$ modulo $y^{r+1}$. Such residues are the
polynomials (in $y$) with which we shall be concerned. As in arithmetic, we might
now pass from congruence as in (1.3) to equality between residue classes. It
is more convenient however to proceed as follows: (1.3) is written


(1.4)    $F(x, y) \sim F_r(x, y)$,


which is read, “$F(x, y)$ is equivalent to $F_r(x, y)$.” The properties of this
equivalence follow from (1.2) or (1.3), the latter of which has the usual properties
of a congruence relation; the conclusions are restated in the form (1.4). The
equivalence in (1.4) has the usual properties of an equivalence relation in
algebra. In each of (1.5)-(1.7) the first relation implies the second:


(1.5)    $aF(x, y) + bG(x, y) = H(x, y)$,
         $aF_r(x, y) + bG_r(x, y) \sim H_r(x, y)$,


where $a$, $b$ are constants;


(1.6)    $F(x, y)G(x, y) = H(x, y)$,
         $F_r(x, y)G_r(x, y) \sim H_r(x, y)$;


(1.7)    $F(x, y)/G(x, y) = H(x, y)$,
         $F_r(x, y)/G_r(x, y) \sim H_r(x, y)$,


provided $G^{(0)}(x)$, the term of $G(x, y)$ independent of $y$ in the notation of (1.1), is not zero.
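The rules (1.6) and (1.7) can be tried on series in $y$ alone, $x$ being held fixed. In the sketch below a series is represented by its list of coefficients, and the function names (`residue`, `multiply`, `divide`) are chosen here for illustration; exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def residue(series, r):
    """Keep the terms of degree <= r in y (the residue modulo y**(r+1))."""
    head = [Fraction(c) for c in series[: r + 1]]
    return head + [Fraction(0)] * (r + 1 - len(head))

def multiply(f, g, r):
    """Product of two series, reduced modulo y**(r+1)."""
    out = [Fraction(0)] * (r + 1)
    for i, fi in enumerate(f[: r + 1]):
        for j, gj in enumerate(g[: r + 1 - i]):
            out[i + j] += fi * gj
    return out

def divide(f, g, r):
    """Quotient f/g reduced modulo y**(r+1); needs g[0] != 0, as in (1.7)."""
    if g[0] == 0:
        raise ZeroDivisionError("the term of g independent of y must not vanish")
    f, g = residue(f, r), residue(g, r)
    out = []
    for k in range(r + 1):
        acc = f[k] - sum(out[i] * g[k - i] for i in range(k))
        out.append(acc / g[0])
    return out

# F = 1/(1 - y) and G = 1/(1 - y)**2, each carried to eight terms.
F = [1] * 8
G = [1, 2, 3, 4, 5, 6, 7, 8]
r = 2

product_then_residue = residue(multiply(F, G, 7), r)
residues_then_product = multiply(residue(F, r), residue(G, r), r)
quotient = divide(F, G, r)
print(product_then_residue, residues_then_product, quotient)
```

Here the two products agree, and the quotient reproduces the residue of $1 - y$, so that only the residues of $F$ and $G$ are needed to obtain the residue of a product or quotient.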

From (1.5)-(1.7) we obtain the equivalence corresponding to any rational
relation between $F(x, y)$, $G(x, y)$, $\ldots$, $P(x, y)$, $\ldots$. Irrational relations also
occur; for example $(F(x, y))^\alpha$, written as $F^\alpha(x, y)$, where $\alpha$ is a positive rational
number, may be expanded in the form (1.1), say $F^\alpha(x, y) = P(x, y)$; we then
have $F_r^\alpha(x, y) \sim P_r(x, y)$. If $\alpha < 0$, the equivalence holds provided the term
$F^{(0)}(x)$ of $F(x, y)$ independent of $y$ is not zero.

It will be seen that the residues $F_r(x, y)$, $\ldots$ obtained from elliptic
functions can be constructed entirely by finite processes for $r = 0, 1, 2, \ldots$, so
that we are operating essentially in the finite domain. The elliptic functions
are the infinite series, as $r \to \infty$ through integer values, of the polynomials.

The functions and polynomials introduced in the following section are in- 
dispensable for our purpose; they do not seem to have been noticed before. 

2. Functions $U$, $V$; polynomials $A$, $B$, $C$, $D$. Let $a$ be a non-negative
integer. The functions $U$, $V$ are defined by


(2.1)    $U_0(x) = \sin x$,    $U_a(x) = \sum_{s=0}^{\infty} (-1)^s s^a\,\frac{x^{2s+1}}{(2s+1)!}$,    $a > 0$;

(2.2)    $V_0(x) = \cos x$,    $V_a(x) = \sum_{s=0}^{\infty} (-1)^s s^a\,\frac{x^{2s}}{(2s)!}$,    $a > 0$.

If $\Delta$ is the operator of finite differences, and $m$, $n$ are integers $> 0$, we define
$v_n(m)$ by

(2.3)    $m!\,v_n(m) = (\Delta^m x^n)_{x=0} = \sum_{i=0}^{m} (-1)^i (m, i)(m - i)^n$,


where $(m, i)$ is the binomial coefficient $m!/i!(m-i)!$. For $m \leq n$, $v_n(m)$ is a
positive integer; $v_n(m) = 0$ for $m > n$. We shall write

    $s^{[0]} = 1$,    $s^{[m]} = s(s - 1)\cdots(s - m + 1)$;


(2.4)    $u_a(0) = 1$;    $u_a(i) = 0$, $i > a$;    $u_a(i) = \sum_{j=0}^{a-i} (-1)^j (a, i+j)\,v_{i+j}(i)$,  $0 < i \leq a$.


The $u_a(i)$ are further considered in (2.27)-(2.30).
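The numbers $v_n(m)$ of (2.3) can be computed directly from the alternating sum. The sketch below does so and checks the expansion of a power in factorial powers, $s^n = \sum_{i=1}^{n} v_n(i)\,s^{[i]}$, which is the property used in what follows; the function names are chosen here for illustration.

```python
from math import comb, factorial

def v(n, m):
    """v_n(m) from (2.3): (1/m!) * sum_{i=0}^{m} (-1)**i * C(m, i) * (m - i)**n, for m, n > 0."""
    total = sum((-1) ** i * comb(m, i) * (m - i) ** n for i in range(m + 1))
    return total // factorial(m)  # the sum is an exact multiple of m!

def falling(s, m):
    """Factorial power s^[m] = s (s - 1) ... (s - m + 1), with s^[0] = 1."""
    out = 1
    for k in range(m):
        out *= s - k
    return out

def power_by_factorial_powers(s, n):
    """Right-hand side of s**n = sum_{i=1}^{n} v_n(i) * s^[i]."""
    return sum(v(n, i) * falling(s, i) for i in range(1, n + 1))

for n in range(1, 6):
    print(n, [v(n, m) for m in range(1, n + 2)])
```

Each printed row ends in a zero, in agreement with $v_n(m) = 0$ for $m > n$.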
From the binomial expansion of (—1)?2¢s¢= [1—(2s+1) ]*, (2.1), (2.4), 
and the known expansion 


i=1 


we obtain 


(— 1)22°U,(x) = 


If D,* denotes the operator d‘/dx‘, then 


(— 1)‘u,(i)(2s + pm), 


t=1 


> 
sin x = — 1)*(2s + 1)! 


hence, if a>0, 


+ >> (— sin x; 


i=1 


(— 1)22*U,(x) 


therefore, if A, B are defined by 


(2.6) A,(x) = 1)‘u,(2i)x**, a>OdO, 


i=0 


<(a—1)/2 
(2.7) BAx)= (— 1)iu(2i+ a>O, 


t=0 


we have

$$(-1)^a 2^a U_a(x) = A_a(x) \sin x - B_a(x) \cos x, \quad a > 0. \tag{2.8}$$

To (2.6), (2.7) we add the initial values

$$A_0(x) = 1, \qquad B_0(x) = 0, \tag{2.9}$$

so that (2.8) holds for $a = 0$.
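The next equations follow similarly from (2.2) and

$$(2s)^a = \sum_{i=1}^{a} v_a(i)(2s)^{[i]}, \quad a > 0:$$

$$2^a V_a(x) = C_a(x) \sin x + D_a(x) \cos x, \quad a \ge 0; \tag{2.10}$$

$$C_0(x) = 0, \quad C_1(x) = -x, \quad D_0(x) = 1, \quad D_1(x) = 0; \tag{2.11}$$

$$C_a(x) = \sum_{1 \le i \le (a+1)/2} (-1)^i v_a(2i-1)\, x^{2i-1}, \tag{2.12}$$

$$D_a(x) = \sum_{1 \le i \le a/2} (-1)^i v_a(2i)\, x^{2i}, \quad a > 1. \tag{2.13}$$
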

The values of the polynomials for $a = 0, 1, \cdots$ can be calculated directly from (2.7), (2.9), (2.11)–(2.13), but it is easier to obtain them by recurrence. Where the argument $x$ is understood, we shall suppress it, and write $U_a, V_a, \cdots, D_a$ for $U_a(x), \cdots, D_a(x)$. Primes indicate derivatives with respect to $x$. From (2.1), (2.2) we have at once

$$2U_{a+1} - xU_a' + U_a = 0, \qquad 2V_{a+1} - xV_a' = 0; \tag{2.14}$$

and by reducing with $U_a' = V_a$,

$$2U_{a+1} + U_a - xV_a = 0, \quad a \ge 0. \tag{2.15}$$

Combining (2.8) with the first of (2.14), and (2.10) with the second, we find that

$$A_{a+1} = A_a - xA_a' - xB_a, \qquad B_{a+1} = B_a - xB_a' + xA_a; \tag{2.16}$$

$$C_{a+1} = x(C_a' - D_a), \qquad D_{a+1} = x(C_a + D_a'). \tag{2.17}$$

Similarly, from $U_a' = V_a$ we get

$$C_a = (-1)^a (A_a' + B_a), \qquad D_a = (-1)^a (A_a - B_a'); \tag{2.18}$$

and from (2.15),

$$A_{a+1} - A_a + (-1)^a xC_a = 0, \qquad B_{a+1} - B_a - (-1)^a xD_a = 0. \tag{2.19}$$


The last give

$$\begin{aligned} A_a &= 1 + (-1)^a x\,(C_{a-1} - C_{a-2} + C_{a-3} - \cdots + (-1)^{a-1} C_0), \\ B_a &= -(-1)^a x\,(D_{a-1} - D_{a-2} + D_{a-3} - \cdots + (-1)^{a-1} D_0), \end{aligned} \tag{2.20}$$

for $a > 0$, or for $a \ge 0$ if we define $C_s = D_s = 0$ for $s < 0$.
The polynomials satisfy no linear differential equation of order independent of $a$. Elimination of $A$ or of $B$ from (2.16) gives

$$X_{a+2} - 3X_{a+1} + 2xX_{a+1}' + (x^2 + 2)X_a - 2xX_a' + x^2 X_a'' = 0, \tag{2.21}$$

satisfied by $X = A$, $X = B$; and in the same way

$$Y_{a+2} + Y_{a+1} - 2xY_{a+1}' + x^2 (Y_a'' + Y_a) = 0, \tag{2.22}$$

satisfied by $Y = C$, $Y = D$.

The preceding relations give recurrences for the coefficients $u$, $v$ in the polynomials. Thus, substituting from (2.12), (2.13) into (2.17) we get the recurrence, equivalent to that for the numbers $\Delta^m 0^n/m!$,

$$v_{a+1}(i) - i\,v_a(i) - v_a(i-1) = 0; \tag{2.23}$$

and similarly from (2.16), (2.6), (2.7),

$$u_{a+1}(i) + (i-1)\,u_a(i) - u_a(i-1) = 0. \tag{2.24}$$

The $u$ are expressed in terms of the $v$ from (2.18),

$$i\,u_a(i) - u_a(i-1) = (-1)^{a+i} v_a(i-1), \quad a > 0; \tag{2.25}$$

or from (2.20),

$$u_a(i) = (-1)^{a+i+1} \sum_{j=1}^{a-i+1} (-1)^j v_{a-j}(i-1), \quad 1 < i \le a. \tag{2.26}$$



For $i$ fixed, (2.24) is in the standard form of the linear difference equation of the first order, with the restriction that $u_0(i)$ is not defined. Solving (2.24) we find


and hence, since $u_2(2) = 1$, $u_2(i) = 0$, $i > 2$, we have

$$u_{a+1}(2) = (-1)^a \Big[ \sum_{j=2}^{a} (-1)^j - 1 \Big] = \tfrac{1}{2}\,[1 + (-1)^{a+1}],$$

which also is seen directly to hold for $a = 0, 1$, and

$$u_{a+1}(i) = (1 - i)^a \sum_{j=i-1}^{a} \frac{u_j(i-1)}{(1 - i)^j}, \quad a > i. \tag{2.27}$$

At once from (2.24),

$$u_a(0) = u_a(1) = 1. \tag{2.28}$$

Taking $i = 3$ in (2.27), and noting that $u_2(3) = 0$, we get

$$6u_a(3) = -(-1)^a [2^a - (-1)^a - 3], \quad a > 0; \tag{2.29}$$

whence, with $i = 4$ in (2.27),

$$24u_a(4) = (-1)^a [3^a - 2^{a+2} + (-1)^a + 6], \tag{2.30}$$


and so on. 
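The coefficient recurrences lend themselves to a quick machine check. The following is a minimal sketch, not part of the original paper: it assumes that (2.24) reads $u_{a+1}(i) = (1-i)u_a(i) + u_a(i-1)$ with $u_1(0) = u_1(1) = 1$, that (2.3) defines $v_n(m)$ as stated, and that (2.29), (2.30) have the closed forms $6u_a(3) = -(-1)^a[2^a - (-1)^a - 3]$ and $24u_a(4) = (-1)^a[3^a - 2^{a+2} + (-1)^a + 6]$. The names `v` and `u_table` are hypothetical helpers.

```python
from math import comb, factorial


def v(n, m):
    """v_n(m) from (2.3): (1/m!) * sum_i (-1)^i C(m, i) (m - i)^n."""
    total = sum((-1) ** i * comb(m, i) * (m - i) ** n for i in range(m + 1))
    return total // factorial(m)


def u_table(n):
    """Return {a: [u_a(0), ..., u_a(n)]} for 1 <= a <= n.

    Built from the recurrence (2.24) as read here:
    u_{a+1}(i) = (1 - i) u_a(i) + u_a(i - 1), with u_1(0) = u_1(1) = 1.
    """
    u = {1: [1, 1] + [0] * (n - 1)}
    for a in range(1, n):
        row = [1]  # u_a(0) = 1, cf. (2.28)
        for i in range(1, n + 1):
            row.append((1 - i) * u[a][i] + u[a][i - 1])
        u[a + 1] = row
    return u


if __name__ == "__main__":
    u = u_table(8)
    for a in range(1, 9):
        print(a, u[a])
```

Each printed row lists the absolute coefficients of $A_a$ and $B_a$ interleaved by degree (up to the signs $(-1)^i$ of (2.6), (2.7)), so the rows can be compared with the table of polynomials given later in this section.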

The polynomials are easily calculated recursively from (2.11), (2.16)- 
(2.19). The first 9 in each set follow. The argument in all is x. These suffice 
for obtaining approximations to the Jacobian elliptic functions with modulus 
k up to terms of order k"®. 


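$A_0 = 1$, $\quad B_0 = 0$
$A_1 = 1$, $\quad B_1 = x$
$A_2 = 1 - x^2$, $\quad B_2 = x$
$A_3 = 1$, $\quad B_3 = x - x^3$
$A_4 = 1 - x^2 + x^4$, $\quad B_4 = x + 2x^3$
$A_5 = 1 - 5x^4$, $\quad B_5 = x - 5x^3 + x^5$
$A_6 = 1 - x^2 + 20x^4 - x^6$, $\quad B_6 = x + 10x^3 - 9x^5$
$A_7 = 1 - 70x^4 + 14x^6$, $\quad B_7 = x - 21x^3 + 56x^5 - x^7$
$A_8 = 1 - x^2 + 231x^4 - 126x^6 + x^8$, $\quad B_8 = x + 42x^3 - 294x^5 + 20x^7$

$C_0 = 0$, $\quad D_0 = 1$
$C_1 = -x$, $\quad D_1 = 0$
$C_2 = -x$, $\quad D_2 = -x^2$
$C_3 = -x + x^3$, $\quad D_3 = -3x^2$
$C_4 = -x + 6x^3$, $\quad D_4 = -7x^2 + x^4$
$C_5 = -x + 25x^3 - x^5$, $\quad D_5 = -15x^2 + 10x^4$
$C_6 = -x + 90x^3 - 15x^5$, $\quad D_6 = -31x^2 + 65x^4 - x^6$
$C_7 = -x + 301x^3 - 140x^5 + x^7$, $\quad D_7 = -63x^2 + 350x^4 - 21x^6$
$C_8 = -x + 966x^3 - 1050x^5 + 28x^7$, $\quad D_8 = -127x^2 + 1701x^4 - 266x^6 + x^8$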
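
The recursive calculation can be sketched in a few lines of code. This is an illustrative sketch, not the author's procedure: it assumes the recurrences (2.16), (2.17) in the form $A_{a+1} = A_a - xA_a' - xB_a$, $B_{a+1} = B_a - xB_a' + xA_a$, $C_{a+1} = x(C_a' - D_a)$, $D_{a+1} = x(C_a + D_a')$ with the initial values $A_0 = D_0 = 1$, $B_0 = C_0 = 0$. Polynomials are stored as coefficient lists, and all function names are hypothetical.

```python
# Polynomials are coefficient lists: p[k] is the coefficient of x**k.


def deriv(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def times_x(p):
    """Multiply a polynomial by x."""
    return [0] + list(p)


def add(*ps):
    """Sum of polynomials of possibly different lengths."""
    n = max(len(p) for p in ps)
    return [sum(p[k] if k < len(p) else 0 for p in ps) for k in range(n)]


def neg(p):
    """Negate a polynomial."""
    return [-c for c in p]


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def abcd(n):
    """Return lists A, B, C, D holding A_a, B_a, C_a, D_a for a = 0..n."""
    A, B, C, D = [[1]], [[0]], [[0]], [[1]]  # initial values (2.9), (2.11)
    for a in range(n):
        # (2.16): A_{a+1} = A_a - x A_a' - x B_a,  B_{a+1} = B_a - x B_a' + x A_a
        new_a = trim(add(A[a], neg(times_x(deriv(A[a]))), neg(times_x(B[a]))))
        new_b = trim(add(B[a], neg(times_x(deriv(B[a]))), times_x(A[a])))
        # (2.17): C_{a+1} = x (C_a' - D_a),  D_{a+1} = x (C_a + D_a')
        new_c = trim(times_x(add(deriv(C[a]), neg(D[a]))))
        new_d = trim(times_x(add(C[a], deriv(D[a]))))
        A.append(new_a)
        B.append(new_b)
        C.append(new_c)
        D.append(new_d)
    return A, B, C, D


if __name__ == "__main__":
    A, B, C, D = abcd(8)
    for a in range(9):
        print(a, A[a], B[a], C[a], D[a])
```

Under these assumptions the printed coefficient lists can be compared entry by entry with the first nine polynomials of each set.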


Although they will not be required here, it may be mentioned that generalizations of $U_a(x)$, $V_a(x)$ to non-integral values of $a$, by means of contour integrals, have been investigated by Professor H. Bateman. These lead to expansions of the functions (for any $a$) in terms of Bessel functions; such expansions offer a point of departure for approximating the values of the polynomials in $y$ ($= k^2$) next obtained as approximations in $y$ to $\mathrm{sn}(x, k)$, $\mathrm{cn}(x, k)$, $\mathrm{dn}(x, k)$, for large values of the $x$ occurring in the coefficients of the polynomials.

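3. The polynomials $\mathrm{sn}_r\,x$. The modulus of $\mathrm{sn}\,x$ being $k$, we write $\mathrm{sn}(x, k) = \mathrm{sn}(x \mid k^2)$, in Milne-Thomson's notation,* as it is usually $k^2$, not $k$, that is given in applications, and only even powers of $k$ appear in the power series expansion of $\mathrm{sn}(x, k)$. Finally we write $k^2 = y$, $1 - y = y_1 = k'^2$, and consider $\mathrm{sn}(x, k) = \mathrm{sn}(x \mid y) = \mathrm{sn}\,x$ as a function of $x$, $y$. Similarly for $\mathrm{cn}\,x$, $\mathrm{dn}\,x$. For the meaning of $\mathrm{sn}_r\,x, \cdots$, see §1. We shall use the forms of the power series obtained in a previous paper.† For $\mathrm{sn}\,x$ we found

$$\mathrm{sn}\,x = \sum_{s=0}^{\infty} \frac{(-1)^s x^{2s+1}}{(2s+1)!} \sum_{r=0}^{s} p_r(s)\, y^r,$$

$$\begin{aligned} p_0(s) &= 1, \qquad 2^4 p_1(s) = 3^{2s+1} - 8s - 3, \\ 2^8 p_2(s) &= 5^{2s+1} - 4(2s - 1)3^{2s+1} + 32s^2 - 32s - 17, \\ 2^{12} p_3(s) &= 7^{2s+1} - 4(2s - 3)5^{2s+1} + (32s^2 - 88s + 30)3^{2s+1} \\ &\quad - \tfrac{1}{3}(256s^3 - 1056s^2 + 752s + 471), \end{aligned}$$

the general $p_j(s)$ being of the form

$$2^{4j} p_j(s) = (2j + 1)^{2s+1} + P_{j1}(s)(2j - 1)^{2s+1} + \cdots + P_{jj}(s)\,1^{2s+1}, \tag{3.1}$$
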

in which $P_{jt}(s)$ is a polynomial in $s$ of degree $t$ with rational coefficients. Moreover it was shown (loc. cit., p. 846, (8)) that the $p_j(s)$ can be calculated by linear recurrence, and in a subsequent paper‡ numerous linear recurrences were given for the calculation of the coefficients appearing in the recurrences for the $p_j(s)$. We may therefore consider the $p_j(s)$ for $j = 0, 1, 2, \cdots$ known, as it is straightforward elementary algebra to obtain a particular $p_j(s)$ by the means indicated.

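For the meaning of $\mathrm{sn}_r\,x$ see (1.2). The $sx$ occurring in the $\sin sx$, $\cos sx$ appearing in $\mathrm{sn}_r\,x$ expresses a number of radians. We have

$$\begin{aligned} \mathrm{sn}_0\,x &= \sum_{s=0}^{\infty} p_0(s) \frac{(-1)^s x^{2s+1}}{(2s+1)!} = \sin x; \\ \mathrm{sn}_1\,x &= \sum_{s=0}^{\infty} [p_0(s) + p_1(s)\,y] \frac{(-1)^s x^{2s+1}}{(2s+1)!}. \end{aligned} \tag{3.2}$$

Reducing the coefficient of $y$ in the last we get (see §2)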


* L. M. Milne-Thomson, Die elliptischen Funktionen von Jacobi (5-figure tables), Berlin, 1931.

† These Transactions, vol. 36 (1934), pp. 841-852, in which note the following misprints: p. 842 (1), for $x^{2}$ read $x^{2s}$; p. 843, last line, for $2^{2s}$ read $3^{2s}$; p. 844, in the expression for $q_3(s)$, for $-297$ read $+297$; p. 844 (7), all exponents on the right should be $2s$, not $s$.

‡ American Journal of Mathematics, vol. 58 (1936), pp. 759-768.
$$\begin{aligned} &\sin 3x - 8U_1(x) - 3 \sin x \\ &\quad = \sin 3x - 3 \sin x + 4(A_1 \sin x - B_1 \cos x) \\ &\quad = \sin 3x + \sin x - 4x \cos x; \end{aligned}$$

$$\mathrm{sn}_1\,x = \sin x + 2^{-4} y\,(\sin 3x + \sin x - 4x \cos x). \tag{3.21}$$
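
As a sanity check on this reduction, the coefficient of $y$ can be summed numerically from the power series and compared with the closed form. The sketch below is illustrative only; it assumes $2^4 p_1(s) = 3^{2s+1} - 8s - 3$ and that the coefficient of $y$ in $\mathrm{sn}\,x$ is $\sum_s (-1)^s p_1(s)\, x^{2s+1}/(2s+1)!$. The function names and the truncation at 40 terms are arbitrary choices.

```python
from math import sin, cos, factorial


def p1(s):
    """p_1(s), from 2^4 p_1(s) = 3^(2s+1) - 8s - 3."""
    return (3 ** (2 * s + 1) - 8 * s - 3) / 16


def y_coeff_series(x, terms=40):
    """Coefficient of y in sn x, summed directly from the power series."""
    return sum(
        (-1) ** s * p1(s) * x ** (2 * s + 1) / factorial(2 * s + 1)
        for s in range(terms)
    )


def y_coeff_closed(x):
    """The same coefficient in closed form: 2^-4 (sin 3x + sin x - 4x cos x)."""
    return (sin(3 * x) + sin(x) - 4 * x * cos(x)) / 16


if __name__ == "__main__":
    for x in (0.3, 1.0, 2.0):
        print(x, y_coeff_series(x), y_coeff_closed(x))
```

For moderate $x$ the two columns should agree to rounding error, which is what the reduction by $U_1$ asserts.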
Proceeding to $\mathrm{sn}_2\,x$ we have the new term in $y^2$. The coefficient of $2^{-8} y^2$ is

$$\begin{aligned} &\sum_{s=0}^{\infty} \frac{(-1)^s x^{2s+1}}{(2s+1)!}\, [5^{2s+1} - (8s - 4)3^{2s+1} + 32s^2 - 32s - 17] \\ &\quad = \sin 5x + 4 \sin 3x - 17 \sin x - 8U_1(3x) + 32U_2(x) - 32U_1(x). \end{aligned}$$

The new detail $U_1(3x)$ is typical of the like in subsequent calculations. By (2.8),

$$-8U_1(3x) = 4[A_1(3x) \sin 3x - B_1(3x) \cos 3x] = 4(\sin 3x - 3x \cos 3x).$$
The remaining terms are evaluated as before, and we get 


$$\begin{aligned} \mathrm{sn}_2\,x = \sin x &+ 2^{-4} y\,(\sin 3x + \sin x - 4x \cos x) \\ &+ 2^{-8} y^2\,[\sin 5x + 8 \sin 3x + (7 - 8x^2) \sin x - 12x \cos 3x - 24x \cos x]. \end{aligned} \tag{3.3}$$


These are enough to show the process. 
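By (1.2) and the cited expansion of $\mathrm{sn}\,x$ we have, for $r > 0$,

$$\mathrm{sn}_r\,x - \mathrm{sn}_{r-1}\,x = y^r \sum_{s=0}^{\infty} p_r(s)\,(-1)^s \frac{x^{2s+1}}{(2s+1)!},$$

and hence by (3.1) the coefficient of $2^{-4r} y^r$ in $\mathrm{sn}_r\,x$ is

$$\sum_{s=0}^{\infty} \frac{(-1)^s x^{2s+1}}{(2s+1)!} \Big[ (2r + 1)^{2s+1} + \sum_{t=1}^{r} P_{rt}(s)(2r + 1 - 2t)^{2s+1} \Big], \tag{3.4}$$

in which

$$P_{rt}(s) = p_{t0}(r)\,s^t + p_{t1}(r)\,s^{t-1} + \cdots + p_{tt}(r), \quad t > 0, \tag{3.5}$$

the $p_{tj}(r)$ being rational numbers; by convention we take $p_{00}(r) = 1$. From this we can determine the general form of $\mathrm{sn}_r\,x$. The result of substituting from (3.5) into (3.4) and reducing by (2.1) is

$$\sin (2r + 1)x + \sum_{t=1}^{r} \sum_{j=0}^{t} p_{tj}(r)\, U_{t-j}((2r + 1 - 2t)x). \tag{3.6}$$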
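To express this in the same form as (3.3) we apply (2.8), noting that $p_{00}(r) = 1$, and defining the polynomials $\alpha$, $\beta$ by

$$\alpha_t(r, x) = \sum_{j=0}^{t} (-1)^{t-j}\, 2^{j-t}\, p_{tj}(r)\, A_{t-j}((2r + 1 - 2t)x),$$

$$\beta_t(r, x) = \sum_{j=0}^{t} (-1)^{t-j}\, 2^{j-t}\, p_{tj}(r)\, B_{t-j}((2r + 1 - 2t)x).$$

Then the whole expression in (3.6) becomes

$$\sum_{t=0}^{r} [\alpha_t(r, x) \sin (2r + 1 - 2t)x - \beta_t(r, x) \cos (2r + 1 - 2t)x].$$

Hence finally, for $r > 0$,

$$\mathrm{sn}_r\,x = \sin x + \sum_{i=1}^{r} 2^{-4i} y^i \sum_{t=0}^{i} [\alpha_t(i, x) \sin (2i + 1 - 2t)x - \beta_t(i, x) \cos (2i + 1 - 2t)x]. \tag{3.7}$$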

For a given $r$, and $c$ an integer $> 0$, $\mathrm{sn}_r^c\,x$ can be calculated directly from $\mathrm{sn}_r\,x$. Powers of sines and cosines are expressed as sums of sines or cosines of multiple angles. For $c = r = 2$ the result is, by (3.3),

$$\begin{aligned} \mathrm{sn}_2^2\,x = 2^{-1}(1 - \cos 2x) &+ 2^{-4} y\,(1 - \cos 4x - 4x \sin 2x) \\ &+ 2^{-9} y^2\,[16 + (3 + 32x^2) \cos 2x - 16 \cos 4x - 3 \cos 6x \\ &\qquad - 32x \sin 4x - 40x \sin 2x]. \end{aligned} \tag{3.8}$$
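
The squaring step can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $\mathrm{sn}_2\,x$ in the form $\sin x + 2^{-4}y(\sin 3x + \sin x - 4x\cos x) + 2^{-8}y^2[\sin 5x + 8\sin 3x + (7 - 8x^2)\sin x - 12x\cos 3x - 24x\cos x]$, squares it, drops powers of $y$ above the second, and compares the three coefficients with those of (3.8) read with the factors $2^{-1}$, $2^{-4}$, $2^{-9}$. The function names are hypothetical.

```python
from math import sin, cos


def sn2_parts(x):
    """Coefficients of y^0, y^1, y^2 in sn_2 x, as in (3.3)."""
    s0 = sin(x)
    s1 = (sin(3 * x) + sin(x) - 4 * x * cos(x)) / 2 ** 4
    s2 = (sin(5 * x) + 8 * sin(3 * x) + (7 - 8 * x * x) * sin(x)
          - 12 * x * cos(3 * x) - 24 * x * cos(x)) / 2 ** 8
    return s0, s1, s2


def sn2_squared_parts(x):
    """Coefficients of y^0, y^1, y^2 in (sn_2 x)^2; higher powers dropped."""
    s0, s1, s2 = sn2_parts(x)
    return s0 * s0, 2 * s0 * s1, s1 * s1 + 2 * s0 * s2


def formula_38_parts(x):
    """Coefficients of y^0, y^1, y^2 as given in (3.8)."""
    c0 = (1 - cos(2 * x)) / 2
    c1 = (1 - cos(4 * x) - 4 * x * sin(2 * x)) / 2 ** 4
    c2 = (16 + (3 + 32 * x * x) * cos(2 * x) - 16 * cos(4 * x)
          - 3 * cos(6 * x) - 32 * x * sin(4 * x)
          - 40 * x * sin(2 * x)) / 2 ** 9
    return c0, c1, c2


if __name__ == "__main__":
    for x in (0.4, 1.1, 2.3):
        print(sn2_squared_parts(x))
        print(formula_38_parts(x))
```

Agreement of the two printed triples at several values of $x$ is consistent with (3.8) being the square of (3.3) taken modulo $y^3$.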


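4. The polynomials $\mathrm{cn}_r\,x$. The preliminary expansion is

$$\mathrm{cn}\,x = 1 + \sum_{s=1}^{\infty} \frac{(-1)^s x^{2s}}{(2s)!} \sum_{r=0}^{s-1} q_r(s)\, y^r,$$

$$\begin{aligned} q_0(s) &= 1, \qquad 2^4 q_1(s) = 3^{2s} - 8s - 1, \\ 2^8 q_2(s) &= 5^{2s} - 8(s - 1)3^{2s} + 32s^2 - 48s - 9, \\ 2^{12} q_3(s) &= 7^{2s} - 8(s - 2)5^{2s} + 2(16s^2 - 60s + 41)3^{2s} \\ &\quad - \tfrac{1}{3}(256s^3 - 1248s^2 + 1280s + 297). \end{aligned}$$

Corresponding to (3.1), (3.5) we have

$$2^{4r} q_r(s) = (2r + 1)^{2s} + Q_{r1}(s)(2r - 1)^{2s} + \cdots + Q_{rr}(s)\,1^{2s}, \quad r > 0,$$

$$Q_{rt}(s) = q_{t0}(r)\,s^t + q_{t1}(r)\,s^{t-1} + \cdots + q_{tt}(r),$$

the $q_{tj}(r)$ being rational numbers, and $q_{00}(r) = 1$ by convention.
Proceeding as in §3 we find

$$\begin{aligned} \mathrm{cn}_2\,x = \cos x &+ 2^{-4} y\,(\cos 3x - \cos x + 4x \sin x) \\ &+ 2^{-8} y^2\,[\cos 5x + 8 \cos 3x - (9 + 8x^2) \cos x + 12x \sin 3x + 16x \sin x]. \end{aligned} \tag{4.1}$$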
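This may also be obtained (see §1) from $\mathrm{cn}^2 x + \mathrm{sn}^2 x \sim 1$; whence

$$\mathrm{cn}_2\,x \sim (1 - \mathrm{sn}_2^2\,x)^{1/2} = 1 - \tfrac{1}{2}\,\mathrm{sn}_2^2\,x - \cdots.$$

The value of $\mathrm{cn}_2^2\,x$ can be determined directly from (4.1), as for $\mathrm{sn}_2^2\,x$ in §3. The result is (as it should be) $1 - \mathrm{sn}_2^2\,x$, where $\mathrm{sn}_2^2\,x$ is given in (3.8). To indicate the general form corresponding to (3.6), we define the polynomials $\gamma$, $\delta$:

$$\gamma_t(r, x) = \sum_{j=0}^{t} 2^{j-t}\, q_{tj}(r)\, C_{t-j}((2r + 1 - 2t)x),$$

$$\delta_t(r, x) = \sum_{j=0}^{t} 2^{j-t}\, q_{tj}(r)\, D_{t-j}((2r + 1 - 2t)x), \quad t \ge 0,$$

and find, for $r > 0$,

$$\mathrm{cn}_r\,x = \cos x + \sum_{i=1}^{r} 2^{-4i} y^i \sum_{t=0}^{i} [\gamma_t(i, x) \sin (2i + 1 - 2t)x + \delta_t(i, x) \cos (2i + 1 - 2t)x]. \tag{4.2}$$

In getting (4.2) we have used the second of the following formulas, which follows from the first,

$$\mathrm{cn}^2 x + \mathrm{sn}^2 x = 1, \qquad \sum_{t=0}^{r} q_{tt}(r) = 0, \quad r > 0.$$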
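5. The polynomials $\mathrm{dn}_r\,x$. The preliminary expansion is

$$\mathrm{dn}\,x = 1 + \sum_{s=1}^{\infty} \frac{(-1)^s x^{2s}}{(2s)!} \sum_{r=0}^{s-1} h_r(s)\, y^{r+1},$$

$$\begin{aligned} 2^2 h_0(s) &= 2^{2s}, \qquad 2^6 h_1(s) = 2^{2s}(2^{2s} - 8s + 4), \\ 2^{10} h_2(s) &= 2^{2s}[3^{2s} - 4(2s - 3)2^{2s} + 32s^2 - 88s + 31], \end{aligned}$$

the general coefficient being of the form

$$2^{4r+2} h_r(s) = 2^{2s}[(r + 1)^{2s} + H_{r1}(s)\,r^{2s} + \cdots + H_{rr}(s)\,1^{2s}],$$

$$H_{rt}(s) = h_{t0}(r)\,s^t + h_{t1}(r)\,s^{t-1} + \cdots + h_{tt}(r);$$

the $h_{tj}(r)$ are rational numbers, and by convention $h_{00}(r) = 1$. As before we get

$$\mathrm{dn}_2\,x = 1 - 2^{-2} y\,(1 - \cos 2x) + 2^{-6} y^2\,(-5 + 4 \cos 2x + \cos 4x + 8x \sin 2x). \tag{5.1}$$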
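The identity $\mathrm{dn}^2 x = 1 - y\,\mathrm{sn}^2 x$ gives

$$\mathrm{dn}_r^2\,x \sim 1 - y\,\mathrm{sn}_{r-1}^2\,x;$$

hence in particular, by (3.8),

$$\mathrm{dn}_2^2\,x = 1 - 2^{-1} y\,(1 - \cos 2x) + 2^{-4} y^2\,(-1 + \cos 4x + 4x \sin 2x). \tag{5.2}$$

In the same way as before we find the general form,

$$\mathrm{dn}_0\,x = 1, \qquad \mathrm{dn}_1\,x = 1 + 2^{-2} y\,(-1 + \cos 2x), \tag{5.3}$$

with $\epsilon$, $\zeta$ defined by

$$\epsilon_t(r, x) = \sum_{j=0}^{t} 2^{j-t}\, h_{tj}(r)\, C_{t-j}(2(r + 1 - t)x),$$

$$\zeta_t(r, x) = \sum_{j=0}^{t} 2^{j-t}\, h_{tj}(r)\, D_{t-j}(2(r + 1 - t)x),$$

and, for $r > 0$,

$$\begin{aligned} \mathrm{dn}_{r+1}\,x = 1 &+ 2^{-2} y\,(-1 + \cos 2x) \\ &+ \sum_{i=1}^{r} 2^{-4i-2} y^{i+1} \sum_{t=0}^{i} [-h_{tt}(i) + \epsilon_t(i, x) \sin (2(i + 1 - t)x) \\ &\qquad + \zeta_t(i, x) \cos (2(i + 1 - t)x)]. \end{aligned} \tag{5.4}$$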

The approximations in this and preceding sections give approximations to the standard elliptic integrals. Thus from

$$E(x) = \int_0^x \mathrm{dn}^2 x\, dx$$

and (5.2) we get

$$E_2(x) = x - 2^{-2} y\,(2x - \sin 2x) - 2^{-6} y^2\,(4x - 4 \sin 2x - \sin 4x + 8x \cos 2x)$$

for the elliptic integral of the second kind.
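
The passage from (5.2) to $E_2(x)$ is a term-by-term integration and can be checked by quadrature. The following is a sketch under assumptions, not part of the original paper: it takes $\mathrm{dn}_2^2\,x = 1 - 2^{-1}y(1 - \cos 2x) + 2^{-4}y^2(-1 + \cos 4x + 4x\sin 2x)$ and the closed form $E_2(x) = x - 2^{-2}y(2x - \sin 2x) - 2^{-6}y^2(4x - 4\sin 2x - \sin 4x + 8x\cos 2x)$, and uses a composite Simpson rule (a substitute numerical method chosen here for simplicity). All names are hypothetical.

```python
from math import sin, cos


def dn2_sq(x, y):
    """dn_2^2 x as in (5.2)."""
    return (1 - y * (1 - cos(2 * x)) / 2
            + y ** 2 * (-1 + cos(4 * x) + 4 * x * sin(2 * x)) / 16)


def e2_closed(x, y):
    """E_2(x): the integral of (5.2) from 0 to x, in closed form."""
    return (x - y * (2 * x - sin(2 * x)) / 4
            - y ** 2 * (4 * x - 4 * sin(2 * x) - sin(4 * x)
                        + 8 * x * cos(2 * x)) / 64)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


if __name__ == "__main__":
    y = 0.3
    for x in (0.5, 1.5, 3.0):
        numeric = simpson(lambda t: dn2_sq(t, y), 0.0, x)
        print(x, numeric, e2_closed(x, y))
```

The quadrature and the closed form should agree to within the Simpson error for any fixed $y$; this checks the integration only, not the accuracy of $E_2$ as an approximation to $E$.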
PLANE PEANIAN CONTINUA WITH UNIQUE MAPS 
ON THE SPHERE AND IN THE PLANE* 


BY 
V. W. ADKISSON 


INTRODUCTION 


The plane peanian continuum M is said to have a unique map on a spherical surface (or a plane) if, and only if, for any topological image M’ of M on a sphere (or plane) S’ and any topological image M’’ of M on a sphere (or plane) S’’, every homeomorphism of M’ into M’’ can be extended to a homeomorphism of S’ into S’’.

It is the purpose of this paper to characterize the plane peanian continua 
that have unique maps on the sphere or in the plane.f 

DEFINITIONS. The simple closed curve J of a cyclicly connected continuum C is called a bounding circuit of C provided that for any two maximal connected components H and K of C−J the sets H·J and K·J lie respectively on two distinct arcs AXB and AYB of J.§

A split circuit is a bounding circuit J such that C—J contains at least two 
components. 

If C is cyclicly connected, then C is triply connected if, and only if, it is impossible to express C as the sum of two closed connected sets A and B such that neither A nor B is an arc and the set A·B consists of two distinct points of C.||

Theorems IV and V state the principal results of this paper. 


THEOREM IV. The plane peanian continuum M has a unique map on the 
sphere if, and only if, one of the following conditions holds: 


* Presented to the Society, September 9, 1937; received by the editors July 22, 1937. 

† A plane peanian continuum is a peanian continuum (continuous curve) that has a map (topological image) in the plane or on the sphere. For a characterization of these continua see W. S. Claytor, Topological immersion of peanian continua in a spherical surface, Annals of Mathematics, vol. 35 (1934), pp. 809-835. See also Claytor, Peanian continua not imbeddable in a spherical surface, ibid., vol. 38 (1937), pp. 631-646.

‡ This problem was suggested by J. R. Kline. The author also wishes to express his appreciation of suggestions by Saunders MacLane which have led to improvements in this paper.

§ For cyclicly connected continua bounding circuit is equivalent to boundary curve as defined by Claytor, loc. cit., first paper, p. 809. A simple closed curve J of M is called a boundary curve of M provided that there do not exist in M−J distinct components H and K such that (1) a point pair of H·J separates a point pair of K·J on J, or (2) H·J = K·J = three distinct points. Both definitions will be found useful in later proofs.

|| Cf. definition of triply connected graphs, H. Whitney, Congruent graphs and the connectivity 
of graphs, American Journal of Mathematics, vol. 54 (1932), p. 158. 
(1) M is acyclic and consists of either a simple arc or a triod.*

(2) M contains one cyclic element C† which is a maximal triply connected cyclic curve of M, and M−C consists of at most a countable number of arcs, $a_1, a_2, a_3, \cdots$, such that $\bar{a}_i \cdot \bar{a}_j = 0$, ($i \neq j$), and each $\bar{a}_i \cdot C$ is a single point which lies on only one bounding circuit of C,‡ provided that if C is a simple closed curve, then M−C is at most a simple arc.


THEOREM V. The plane bounded peanian continuum M has a unique map 
in the plane if, and only if, M is one of the following curves: 

(1) a simple arc,

(2) a triod,

(3) a simple closed curve,

(4) a curve M such that M contains a closed 2-cell C and M−C consists of at most a countable number of arcs $a_1, a_2, a_3, \cdots$, such that $\bar{a}_i \cdot \bar{a}_j = 0$, ($i \neq j$), and each $\bar{a}_i \cdot C$ is a single point which lies on the only bounding circuit of C.§


PRELIMINARY THEOREMS 


THEOREM I. The plane cyclic (cyclicly connected) peanian continuum C is triply connected if, and only if, C contains no split circuit.


THEOREM II. The plane cyclic peanian continuum C has a unique map on the sphere if, and only if, C is triply connected.


COROLLARY 1. If J is a bounding circuit of C which is not a split circuit, then in every map of C on the sphere (or plane) the image of J is the boundary of a complementary domain of the map of C.


COROLLARY 2. If J is a split circuit of C there is some map of C on the sphere (or plane) in which the image of J is the boundary of a complementary domain of the map of C.||

DEFINITION. The point p of M is a split-point of M if, and only if, M can be expressed as the sum of two closed connected sets A and B such that A·B = p, and such that if A or B is an arc, then p is not an end point of that arc.


THEOREM III. If the plane peanian continuum M has a unique map on the sphere or in the plane, then M does not contain a split-point.


* Three arcs PA, PB, PC with P, and only P, common to any two. 

† G. T. Whyburn, Concerning the structure of a continuous curve, American Journal of Mathematics, vol. 50 (1928), p. 168.

‡ Each point $\bar{a}_i \cdot C$ is a non-local cut-point of C. See Theorem VI and the alternative statement of Theorem IV at the end of this paper.

§ See Claytor, loc. cit., p. 810, Corollary (C).

|| This is a generalization of Proposition K, Claytor, loc. cit., first paper, p. 828.
PROOFS OF THEOREMS I-V 


Proof of Theorem I. Let C be a map of the given curve on a plane S, and suppose C not triply connected. There exist two points p and q of C such that C = A+B and A·B = p+q, where A and B are closed connected sets neither of which is an arc. Let r be a point of A−p−q and s a point of B−p−q. Then on any arc rxs lying in S−p−q there is a last point of A from r to s and a first point of B. Let these points be r and s respectively. The open arc (rxs) lies in a domain R complementary to C with boundary J ⊃ r+s. Hence this circuit J must pass through p and q. But since neither A nor B is an arc there must exist at least two components of C−J; one in A−A·J and one in B−B·J. Let $N_1$ and $N_2$ be any two components of C−J. Then two points of $N_1 \cdot J$ cannot separate two points of $N_2 \cdot J$ on J. For $N_1$ and $N_2$ are connected sets lying in S−R and cannot intersect. If $N_1 \cdot J = N_2 \cdot J =$ three points, then $N_1$ and $N_2$ must intersect; this is impossible. Therefore, since C−J contains at least two components, J is a split circuit.

Now assume that C contains a split circuit J. There is a component $N_1$ of C−J with limit points all on the arc gxh of J and a component $N_2$ with limit points all on the arc gyh = J−(gxh). The arc gxh may be chosen so that g and h are points of $N_1 \cdot J$. Now every component of C−J must have its limit points in either gxh or gyh. For suppose some component N of C−J has a limit point d in (gxh) and a limit point e in (gyh). Every arc of J containing d and e will necessarily contain either g or h, limit points of $N_1$, which separate d and e on J. But this contradicts the fact that J is a split circuit.

Now let A = gxh + $N_1$ plus all components of C−J (different from $N_2$) with limit points on gxh, and let B = gyh + $N_2$ plus all components of C−J not included in A. Then C = A+B, where A and B are closed connected subsets of C; neither A nor B is a simple arc; and A·B = g+h. Therefore C is not triply connected. This completes the proof of Theorem I.

Proof of Theorem II. Suppose C is triply connected. Then every bounding circuit of C has an image which is a c.d.b.* in every map of C. For let C’ be any map of C on a sphere S’, and J’ any bounding circuit of C’. If C’ does not consist of a simple closed curve, then there is one, and only one, component of C’−J’ (Theorem I), and this component must lie entirely in one of the regions of S’ bounded by J’. The other region of S’ bounded by J’ must be a complementary domain of C’. Obviously every c.d.b. of C’ is also a bounding circuit of C.

Therefore if C’ is a map of C on a sphere S’ and C’’ a map of C on S’’, every homeomorphism of C’ into C’’ must preserve complementary domain


* “Complementary domain boundary” will be abbreviated “c.d.b.” 
boundaries and is extendable to S’ and S’’.* Hence C has a unique map on the sphere.

Suppose C is not triply connected. Then C must contain a split circuit J (Theorem I). We shall show first that there is a map of C in which the map of J is a c.d.b. Let C be any map of C on a sphere S and suppose J is not a c.d.b. of C. Let $R_1$ and $R_2$ be the two regions of S bounded by J. Since J is not a c.d.b. of C there must exist a component $N_1$ of C−J in $R_1$ and a component $N_2$ in $R_2$. Any two points p and q of $N_1 \cdot J$ lie on some c.d.b. of C within $R_2$. For if every arc from p to q lying in $R_2$ contains a point of C, there would then exist a connected subset of C lying in $R_2$ with limit points on J that separate p and q.† But such a connected subset would belong to a component of C−J, and J would not be a split circuit. Hence p and q lie on some c.d.b. within $R_2$. Furthermore any three points p, q, r of $N_1 \cdot J$ lie on the same c.d.b. of C within $R_2$. To show this suppose p, q, r do not lie on the same c.d.b. within $R_2$. Then there are three complementary domains of C in $R_2$ each containing a pair of p, q, r on its boundary $J_i$, (i = 1, 2, 3). Let l, m, n be three arcs, one from each of the boundaries $J_i$, such that l+m+n ⊃ p+q+r and bounds a region $R_3$ which is a subset of $R_2$ containing no points of $J_1 + J_2 + J_3$. This is possible since no two arcs of l, m, n can have a common point. For if two arcs, l, m do have a common point (other than an end point) there would exist a component $N_3$ ⊃ (l+m) of C−J such that $N_3 \cdot J$ contains p, q, r, and since $N_1 \cdot J$ contains p, q, r, the circuit J would not be a split circuit (definition). Now any two points of p, q, r lie together on the same c.d.b. of C within $R_3$. For if every arc from p to q (or to r) lying in $R_3$ contained a point of C, there would exist, as above, a connected subset of C lying in $R_3$ and having points on l+m+n that separate p and q on l+m+n. There would then exist a component $N_4$ of C−J such that $N_4 \cdot J$ contains p, q, and r; but this is impossible. Now let the circuit $J_1$ above be the boundary of a domain $r_1$ complementary to C and lying in $R_2$. Let l be the arc (of the three l, m, n) which lies on $J_1$ ⊃ p+q. Then since $R_3$ contains no points of $J_1 + J_2 + J_3$, the domain $r_1$ is a subset of $R_2$, but not of $R_3$, and has p and q on its boundary. Let $r_2$ be a domain complementary to C which is a subset of $R_3$ and has p and q on its boundary. Now $R_3$ is not a complementary domain of C since we are assuming that p, q, r

* V. W. Adkisson, Cyclicly connected continuous curves whose complementary domain boundaries are homeomorphic, preserving branch points, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Class 3, vol. 23 (1930), p. 167, Theorem 2. If M is a cyclicly connected continuous curve lying on a sphere S, and T is a continuous (1-1) correspondence such that T(M) = M, a necessary and sufficient condition that T be extendable to S is that for every boundary J of a complementary domain of M, T(J) be also the boundary of a complementary domain of M.

† C. Kuratowski, Sur le problème des courbes gauches en topologie, Fundamenta Mathematicae, vol. 15 (1930), p. 274, Lemma III’.
do not all lie on the same c.d.b. in $R_2$. The above process can then be repeated and a third complementary domain $r_3$ obtained which is different from $r_1$ and $r_2$ but also has p and q on its boundary. This process may be continued indefinitely. Hence there exists an infinite number of domains $r_1, r_2, r_3, \cdots$ each complementary to C and having p and q on its boundary. But this is impossible since there is at most a finite number of complementary domains of C of diameter greater than any ε > 0,* and the diameter of each $r_i$ is equal to or greater than the distance between p and q. We conclude that there is a c.d.b. L within $R_2$ containing p, q, r.

Now any fourth point s of $N_1 \cdot J$ must also lie on L. For s and one of the three points p, q, r, say r, must separate the other two, p and q, on J. Since r and s must lie on the same c.d.b. of C in $R_2$ there exists an arc (rs) in $R_2$ such that (rs)·C = 0, and in like manner an arc (pq) in $R_2$ such that (pq)·C = 0. But (rs) and (pq) must then have a common point and hence lie in a common complementary domain of C. Therefore p, q, r, and s all lie on L, and all points of $N_1 \cdot J$ must lie on L.

Let $D_1$ be the complementary domain of C bounded by L, and let (N) represent the set of all components $N_i$ of C−J in $R_1$ for which $N_i \cdot J = N_i \cdot L$. Let (N’) be a topological image of (N) in $D_1$ such that C−(N)+(N’) is a map of C. This is always possible since obviously it would be possible if C were mapped so that L is a circle. This process can be repeated on the map C−(N)+(N’) so that finally a map C’ is obtained in which $R_1$ is a complementary domain of C’. It follows by well known methods in analysis situs that C’ is a topological map of C. For if the above process is necessary an infinite number of times, it involves complementary domains $D_1, D_2, D_3, \cdots$ of C of which only a finite number are of diameter greater than any ε > 0.

By the same method as above it is possible to map C so that the image of 
the split circuit J is not a c.d.b. of the image of C since there are at least two 
components of C—J. 

Now let C’ be a map of C on S’ in which the image of J is a c.d.b. of C’ and C’’ a map on S’’ in which the image of J is not a c.d.b. of C’’. Then not every homeomorphism of C’ into C’’ is extendable to S’ and S’’, since complementary domain boundaries are not preserved. Hence C does not have a unique map, and the theorem is proved.

Corollaries 1 and 2 follow directly from the preceding proof. 

* Schoenflies proved (1908) that the complementary domains of a continuous curve are countable, and that at most a finite number have diameters greater than any ε > 0. See R. L. Moore, Report on continuous curves from the viewpoint of analysis situs, Bulletin of the American Mathematical Society, vol. 29 (1923), pp. 290, 295.

† The components $N_i$ are countable. See R. L. Wilder, Concerning continuous curves, Fundamenta Mathematicae, vol. 7 (1925), p. 360, Theorem 9.
Proof of Theorem III. We shall assume that M has a split point and at the same time a unique map and show that this leads to a contradiction.

Let M be a map on the sphere S, p a split point of M, and A and B two closed connected subsets of M such that M = A+B, A·B = p, and such that if A or B is an arc, p is not an end point of this arc. Let R be a complementary domain of M whose boundary contains points of both A and B. Let x ≠ p be a point of A in the boundary of R and z ≠ p a point of B in the boundary of R. Let (xyz) be an arc in R, xpz an arc of M, A ⊃ arc xp, and B ⊃ arc pz. Let $A_1$ be a connected component of A−xp such that $A_1$ has a point r on xp−x, and $B_1$ a connected component of B−pz such that $B_1$ has a point s on pz−z. Since neither A nor B is a simple arc with end point at p, it is possible to choose x and z so that this latter condition may be satisfied. There are two cases to consider.

Case 1. r ≠ s. Let M’ represent a map of M on the sphere S’. Let T be a homeomorphism such that T(M) = M’ and T(S) = S’. This is possible since we assume that M has a unique map. Throughout this proof a primed set will indicate the topological image under T of the unprimed set. Now either the set $A_1$ lies in the region $D_1$ of S bounded by the simple closed curve C = xpzyx while $B_1$ lies in the other region $D_2$ bounded by C, or $A_1$ and $B_1$ lie in the same region, say $D_1$. We shall assume the latter, but the proof is practically the same in either case.

Since T is extendable to S and S’, the sets $A_1'$ and $B_1'$ lie in $D_1'$. We now construct a new map M’’ of M on S’ as follows: Let $H_i'$, (i = 1, 2), be the subset of A’ − arc x’p’ that lies in $D_i'$. The set $H_1'$ includes $A_1'$. Let $H_1''$ be a topological image of $H_1'$ lying in $D_2' - B' \cdot D_2'$, and $H_2''$ a topological image of $H_2'$ lying in $D_1' - B' \cdot D_1'$ such that x’p’ + $H_1''$ + $H_2''$ is a topological image of A’. Then M’’ = x’p’ + $H_1''$ + $H_2''$ + B’. Let U be a homeomorphism such that U(M) = M’’ and U(S) = S’. Let U(xyz) = x’y’’z’, U($H_i$) = $H_i''$, and for points in B+xpz let U = T. Since U is extendable, U($D_1$) = $D_1''$ is a region of S’ containing U(M·$D_1$). Let f be any arc of S’ from a point of $A_1''$ (the topological image of $A_1'$ in $H_1''$) to a point of $B_1'$ which lies entirely in $D_1''$ including end points. Such an arc must intersect C’, the boundary of $D_1'$, since $B_1'$ lies in $D_1'$ and $A_1''$ lies outside $D_1'$; and since x’p’z’ is in the boundary of $D_1''$ the arc f must intersect (x’y’z’). Then any arc g from r’ to s’ lying, except for end points, in $D_1''$ must intersect (x’y’z’). To show this let d and e be regions about r’ and s’ respectively, that do not contain points of x’y’z’. It is possible to obtain an arc t that lies entirely in d and joins a point of g−r’ to a point of $A_1''$. In like manner we obtain an arc u in e joining a point of g−s’ to a point of $B_1'$. The arcs t and u plus the proper subset of g then yield an arc h from $A_1''$ to $B_1'$ lying entirely in $D_1''$. But h must then intersect (x’y’z’), and since neither t nor u can intersect (x’y’z’) the arc g must have a point in common with (x’y’z’). Hence there must be a connected subset of x’y’z’ lying in $D_1''$ with two points on C’’ = x’y’’z’p’x’ (the boundary of $D_1''$) that separate r’ and s’ on C’’.* One of these points must obviously lie on the arc r’s’ of C’’ where r’s’ ⊂ (x’p’z’). But this is impossible since (x’p’z’) and (x’y’z’) have no common points. Therefore the assumption that every homeomorphism of M is extendable has led to a contradiction, and we conclude that M has no unique map.

Case 2. r = s = p. We use the same notation as in Case 1 and obtain in the same manner the map M’’. Any arc joining a point of $A_1''$ to a point of $B_1'$ and lying entirely in $D_1''$ must intersect (x’y’z’). Let d be a region about p’ that contains no points of x’y’z’. Let f be an arc from $A_1''$ to $B_1'$ lying in $D_1''$ and also in d. Then f cannot intersect (x’y’z’) but must intersect (x’y’z’). This contradiction shows that M has no unique map and completes the proof of Theorem III.

Proof of Theorem IV. First, assume that M has a unique map. If M is acyclic it must consist of either a simple continuous arc or a single triod since M contains no split point (Theorem III). Obviously these are the only two acyclic peanian continua without split-points.

If M is not acyclic it contains at least one cyclic element C which is a maximal cyclic curve of M. Since M cannot contain a split-point there is only one such cyclic element C. For if there were a second cyclic element $C_1$ which is a maximal cyclic curve of M, there would exist a cut-point p of M separating C and $C_1$,† and obviously p would also be a split-point.

If M−C contains a maximal connected acyclic subset with a branch-point p, or if M−C contains two arcs with a common end point p on C, then p is in either case a split-point. Therefore, M−C contains at most a countable number of simple arcs‡ with distinct end points on C.

We shall now assume that C is not triply connected and obtain a contradiction of the assumption that M has a unique map. If C is not triply connected, C contains a split circuit J (Theorem I). Let M be a map of M on the sphere S and assume that the components $N_1$ and $N_2$ of C−J lie in the regions $R_1$ and $R_2$ of S bounded by J. There exists a c.d.b. L of C lying in $R_2$ such that $N_1 \cdot J = N_1 \cdot L$ (this was shown in the proof of Theorem II). Let D be the complementary domain of C bounded by L. Let $\sum a_i$ be the set of arcs of M lying in D with end points in $N_1 \cdot L$ and $\sum b_i$ the set of arcs of M·D not included in $\sum a_i$. We now obtain a new map M’ of M on S as follows: Let

* Kuratowski, loc. cit., Lemma III’.

† See G. T. Whyburn, loc. cit., p. 168.

‡ R. L. Wilder, loc. cit., p. 360, Theorem 9.




(K) represent all arcs of M−C with end points on M, and let $N_1'$ be a map in D of $N_1$+(K) such that $N_1 \cdot L = N_1' \cdot L$ and C − $N_1$ + $N_1'$ is a topological image of C+(K). Let $D_1$ be the complementary domain of M − [$N_1$+(K)] that contains $N_1$+(K), and let $\sum a_i'$ be a topological image of $\sum a_i$ in $D_1$ such that each arc $a_i'$ has one end point on L, and the set of all these end points of $\sum a_i'$ is identical with the set of end points of $\sum a_i$ on L. For each $b_i$ we obtain a new arc $b_i'$ as follows: If $b_i \cdot N_1' = 0$, then $b_i' = b_i$. If $b_i \cdot N_1' \neq 0$, let $d_i$ be a region about $Q_i$ (the end point of $b_i$ on L) containing no point of $N_1'$. Then $b_i'$ is taken as a simple arc which is a subset of $b_i$ and lies in $d_i$ with end point $Q_i$. Then M’ = M − $N_1$ − (K) + $N_1'$ is a topological image of M. But not every homeomorphism of M into M’ can be extended to S, since L is a c.d.b. of M but not a c.d.b. of M’. Hence M does not have a unique map, and this contradiction leads to the conclusion that C must be triply connected.

Now suppose C does not consist of a simple closed curve, and there is an 
arc b in M−C with end point p on two bounding circuits, J and L, of C. 
Let (M−b) be any map of (M−b) on S. Then J and L bound two complementary 
domains of C and are outer boundaries of two complementary domains 
$D_1$ and $D_2$ of M−b. Let M′ be a map of M obtained by mapping b 
in $D_1$ with end point at p, and M″ a map of M obtained by mapping b in $D_2$ 
with end point at p. We have now two essentially different maps of M. Hence 
the end points of the arcs in M−C that lie in C must lie in one, and only one, 
bounding circuit of C. 

If C consists of a simple closed curve, M−C cannot contain more than one 
arc. The proof of this statement is not difficult and will be omitted. 

If M consists of a simple continuous arc, the sufficiency of condition (1) 
follows from the fact that any homeomorphism between two arcs is extend- 
able to a homeomorphism of their planes.* If M consists of a simple closed 
curve plus an arc, the proof that M has a unique map is easily obtained from 
the Schoenflies theorem that any homeomorphism between two simple closed 
curves can be extended to a homeomorphism of their planes. 

The proof that a triod has a unique map may also be obtained by a simple 
application of the Schoenflies theorem. 

Suppose C is not a simple closed curve. Let M be a map of M on S, and M′ 
a map on S′. The map of C is unique (Theorem II), and from Corollary 1 we 
see that a c.d.b. of C in one map has an image which is a c.d.b. in every map 
of C. Let $\sum a_i$ be the arcs of M−C with end points on the same bounding 
circuit J of C. Then in the map of C on S these arcs lie in the same complementary 
domain D of C bounded by J, and the map $\sum a_i'$ of $\sum a_i$ on S′ must lie 
in the complementary domain D′ of C′ bounded by J′ (the image of J) since 
the end points of $\sum a_i'$ lie on J′ and on no other bounding circuit of C′. 

* R. L. Moore, Conditions under which one of two given closed linear point sets may be thrown into 
the other one by a continuous transformation of a plane into itself, American Journal of Mathematics, 
vol. 48 (1926), p. 67. 

Now any homeomorphism T(M)=M′ can be extended to D and D′. Let 
pq be one of the arcs $a_i$ in D with end point p on J such that either pq or its 
image p′q′ on S′ is of diameter greater than some $\epsilon > 0$. Let (qxy) be an arc in 
$D - \sum a_i$ where y is a point of J but not an end point of some arc $a_i$ in D. 
Let (q′x′y′) be an arc in $D' - \sum a_i'$ where y′=T(y). Let r and s be points of J 
that separate p and y, and let r′=T(r) and s′=T(s). Let $a_k$ be any arc of $\sum a_i$ 
that lies in the subset d of D bounded by pqxyrp. Then $a_k'$ must lie in the 
subset d′ of D′ bounded by p′q′x′y′r′p′. For if $a_k'$ lies in D′−d′, it must have an 
end point at either p′ or y′. But this is impossible since y was selected not to 
be an end point, and p cannot be an end point common to two arcs in D. 
Hence if $\sum b_i$ is the subset of $\sum a_i$ lying in d, then $T(\sum b_i)$ is the subset of 
$\sum a_i'$ lying in d′. The same would be true of any similar subdivision of D 
and D′. In fact sides are preserved under T as used by Gehman.* Therefore 
T can be extended to D and D′, and in like manner to each complementary 
domain of C and C′, and finally to S and S′. The proof would necessarily be a 
partial duplication of Gehman’s proof and is omitted. This completes the 
proof of Theorem IV. 

Proof of Theorem V. Cases (1), (2), (3) of Theorem V follow easily from 
Theorem IV. 

If M has a unique map and is not acyclic or not a simple closed curve, 
Theorem III shows that M contains one, and only one, cyclic element C 
which is a maximal connected cyclic curve of M. Furthermore C cannot con- 
tain two distinct bounding circuits or consist of a single circuit which would 
be the outer boundary of two complementary domains of M in any map. For 
if this were true there would be a circuit J which would be the outer boundary 
of a bounded complementary domain R of M in some map on a plane S 
(Corollary 1). Now let M′ be a map of M on a plane S′ in which the image of 
the boundary of R is the boundary of the unbounded complementary domain 
R’ of M’. Then obviously any homeomorphism of M into M’ which carries 
the boundary of R into the boundary of R’ cannot be extended to R and R’. 
Therefore C contains only one bounding circuit, and Claytor’s result (Corol- 
lary (C), p. 810) shows that C consists of a closed 2-cell. 

Since M cannot contain a split point, M−C can consist of at most a countable 
number of distinct arcs with distinct end points on the bounding circuit 
of C. 

* H. M. Gehman, On extending a continuous (1-1) correspondence of two plane continuous curves 
to a correspondence of their planes, these Transactions, vol. 28 (1926), proof of Theorem I, pp. 256-260. 

Now suppose M consists of the curve (4) in the theorem. The arcs M—C 
must all lie in the unbounded complementary domain of C in any map of M 
in the plane. Let D and D’ be the two unbounded complementary domains of 
two maps of M and M’ respectively. Then any homeomorphism of M into M’ 
can be extended to D and D’ as indicated in the last paragraph of the proof 
of Theorem IV. 

A point p of a peanian continuum M is said to be a local cut-point of M if, 
and only if, p is a cut-point of some connected open subset of M.* 

The following theorem will be stated without proof: 


THEOREM VI. If M is a peanian continuum in a plane S, any non-cut-point 
p of M lies on two (or more) complementary domain boundaries of M if, and 
only if, p is a local cut-point of M.† 


Now if M is a plane peanian continuum and C a maximal cyclic curve of 
M, every point in C-(M—C) must lie on a bounding circuit of C. Since C 
contains no cut-point (of C) we can make the following alternative statement 
of Theorem IV: 


The plane peanian continuum M has a unique map on the sphere if, and only 
if, one of the following conditions holds: 

(1) M is acyclic and consists of either a simple arc or a triod, 

(2) M contains one cyclic element C which is a maximal triply connected 
cyclic curve of M, and M−C consists of at most a countable number of arcs, 
$a_1, a_2, a_3, \ldots$, such that $\bar{a}_i \cdot \bar{a}_j = 0$ $(i \neq j)$, and each $\bar{a}_i \cdot C$ is a non-local separating 
point of C, provided that if C is a simple closed curve, then M−C is at most a 
simple arc. 


* See G. T. Whyburn, Local separating points of continua, Monatshefte für Mathematik und 
Physik, vol. 36 (1929), pp. 305-314. 

† This theorem is well known to topologists although the author has been unable to find it stated 
explicitly in any published paper. See, however, G. T. Whyburn, Local separating points of continua, 
loc. cit., Theorem 6, and G. T. Whyburn, Concerning points of continuous curves defined by certain im 
kleinen properties, Mathematische Annalen, vol. 102 (1929), pp. 313-336, Theorem 31. 
THE LAW OF APPARITION OF PRIMES IN A 
LUCASIAN SEQUENCE* 


BY 
MORGAN WARD 


I. INTRODUCTION 
1. We call a sequence of rational integers 
(u): $u_0, u_1, u_2, \ldots, u_n, \ldots$ 


Lucasian (Ward [1]†) if it satisfies a linear recursion relation with constant 
integral coefficients, and if $u_n$ divides $u_m$ whenever n divides m. The adjective 
“Lucasian” is chosen in honor of the French mathematician Eduard Lucas 
who first developed a theory of these sequences‡ (Lucas [1], [2]). We are 
concerned here with the fundamental problem of determining a priori all the 
terms of such a sequence divisible by any preassigned modulus m. 

Call the suffix k of a term $u_k$ of (u) divisible by m a place of apparition of m 
in (u), and let $S_m$ denote the set of all places of apparition of m. It follows 
from the results established in Ward [1] that the set $S_m$ consists in general§ 
of all multiples of a finite number of places of apparition $\rho_1, \rho_2, \ldots, \rho_r$ called 
the ranks of apparition of m in (u) with the defining properties 


$u_\rho \equiv 0 \pmod{m}, \qquad u_s \not\equiv 0 \pmod{m}$ if s divides $\rho$. 


The least common multiple‖ $\rho = [\rho_1, \rho_2, \ldots, \rho_r]$ of the ranks of apparition 
of m in (u) is called simply the rank of m in (u). The places of apparition of m 
in (u) are periodic modulo $\rho$, and $\rho$ divides the restricted period¶ of (u) 
modulo m. Furthermore if m=a·b where a and b are co-prime, then the set 
$S_m$ of places of apparition of m is the cross cut of the sets $S_a$ and $S_b$, and 
each rank of apparition of m is the least common multiple of ranks of apparition 
of a and b. 
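
The definitions above can be tried on a small numerical case. The following is a minimal sketch, assuming the Fibonacci numbers (a classical Lucas function of order two) as the Lucasian sequence (u); the function names are illustrative and do not come from the paper, and the places of apparition are only found within the finite range that is computed.

```python
from functools import reduce
from math import gcd


def fibonacci(limit):
    """Return [u_0, ..., u_limit] for u_0 = 0, u_1 = 1, u_{n+2} = u_{n+1} + u_n."""
    terms = [0, 1]
    while len(terms) <= limit:
        terms.append(terms[-1] + terms[-2])
    return terms[: limit + 1]


def places_of_apparition(terms, m):
    """Suffixes k >= 1, within the computed range, for which m divides u_k."""
    return [k for k in range(1, len(terms)) if terms[k] % m == 0]


def ranks_of_apparition(places):
    """Places of apparition that are not multiples of a smaller place."""
    ranks = []
    for k in places:  # places arrive in increasing order
        if not any(k % r == 0 for r in ranks):
            ranks.append(k)
    return ranks


def rank(places):
    """Least common multiple of the ranks of apparition."""
    return reduce(lambda a, b: a * b // gcd(a, b), ranks_of_apparition(places), 1)


terms = fibonacci(60)
s3 = places_of_apparition(terms, 3)
s4 = places_of_apparition(terms, 4)
s12 = places_of_apparition(terms, 12)
```

In this example the places of apparition of 12 form the cross cut of those of the co-prime factors 3 and 4, in agreement with the statement about m = a·b.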

* Presented to the Society, February 26, 1938; received by the editors July 13, 1937. 

† The numbers [1], [2], ... refer to the bibliography at the close of the paper. 

‡ Lucas confined himself in the main to the case when the recursion relation is of order two. 

§ An exception occurs only if m divides every term of (u) beyond a certain point. 

‖ We use $[a, b, \ldots]$ and $(a, b, \ldots)$ to denote the least common multiple and greatest common 
divisor of the integers $a, b, \ldots$ 

¶ The restricted period of (u) modulo m is the least positive integer $\mu$ such that $u_{n+\mu} \equiv a u_n$ 
(mod m) for all large n, where a is a constant integer. For the terminology of the theory of recurring 
series which we employ, see Ward [2]. 

Our fundamental problem reduces then to determining the ranks of apparition 
of primes and powers of primes in (u). In the terminology of Lucas,* 
we must discover the “law of apparition” of primes in (u), and the “law of 
repetition” of primes in (u). I shall confine myself here to the first problem; 
the modulus m will invariably be a prime number p. 

2. It will be well at this point to exhibit some Lucasian sequences. Let 


$f(x) = x^k - c_1 x^{k-1} - \cdots - c_k, \qquad g(x) = x^l - d_1 x^{l-1} - \cdots - d_l$ 


be two polynomials with rational integral coefficients $c_1, \ldots, d_l$. For simplicity 
of exposition we assume that f(x) and g(x) have non-vanishing discriminants 
and resultant.† Let $\alpha_1, \ldots, \alpha_k; \beta_1, \ldots, \beta_l$ denote the roots of 
f(x)=0 and g(x)=0 respectively. Then none of the k(2l+k−1)/2 differences 
$\alpha_i - \beta_j$, $\alpha_i - \alpha_r$, vanish. 


Consider now the sequences (U): $U_0, U_1, \ldots$; (R): $R_0, R_1, \ldots$, where 

$U_n = U_n(f) = \prod_{i<r} \left( \frac{\alpha_i^n - \alpha_r^n}{\alpha_i - \alpha_r} \right), \qquad R_n = R_n(f, g) = \prod_{i,j} \left( \frac{\alpha_i^n - \beta_j^n}{\alpha_i - \beta_j} \right).$ 


Then $U_n$ and $R_n$ are rational integers, and both sequences are clearly divisibility 
sequences. Both sequences are also linear (Ward [1]). Hence, both sequences 
are Lucasian. The sequence $U_n$ for k=2 is the classical Lucas function 
(Lucas [1]), while $R_n$ for g(x)=x−1 is equivalent to the function 
studied by T. A. Pierce [1], P. Poulet [1], and D. H. Lehmer [2]. 

We shall call the polynomials f(x) and g(x) the generators of (R) and (U). 
We refer to both types of sequences as R-sequences. 
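
For the special case g(x) = x−1 the terms of (R) can be computed with integer arithmetic alone. The sketch below is an illustration rather than the paper's method: it assumes the standard fact that the product of $\alpha_i^n - 1$ over the roots of f equals $\det(A^n - I)$, where A is the companion matrix of f, so that $R_n = \det(A^n - I)/\det(A - I)$. All function names are hypothetical, and the determinant of A − I is assumed not to vanish (non-vanishing resultant).

```python
from fractions import Fraction


def companion(c):
    """Companion matrix of f(x) = x^k - c[0] x^(k-1) - ... - c[k-1]."""
    k = len(c)
    rows = [[0] * k for _ in range(k)]
    for i in range(k - 1):
        rows[i][i + 1] = 1
    rows[k - 1] = list(reversed(c))  # last row: c_k, c_{k-1}, ..., c_1
    return rows


def mat_mul(a, b):
    """Product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, value = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            value = -value  # a row swap changes the sign
        value *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for j in range(col, n):
                a[r][j] -= factor * a[col][j]
    return value.numerator  # the determinant of an integer matrix is an integer


def pierce_terms(c, limit):
    """R_1, ..., R_limit for the generators f (coefficients c) and g(x) = x - 1."""
    k = len(c)
    a = companion(c)
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    minus_identity = lambda m: [[m[i][j] - identity[i][j] for j in range(k)]
                                for i in range(k)]
    base = det(minus_identity(a))  # product of (alpha_i - 1); assumed nonzero
    terms, power = [], identity
    for _ in range(limit):
        power = mat_mul(power, a)
        terms.append(det(minus_identity(power)) // base)
    return terms


r = pierce_terms([1, 1], 12)  # f(x) = x^2 - x - 1
```

For f(x) = x² − x − 1 the first terms are 1, 1, 4, 5, 11, 16, and the divisibility property ($R_n$ divides $R_m$ whenever n divides m) can be checked directly on the computed list.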

The determination of the law of apparition for R-sequences is of particular 
importance because it appears probable that all Lucasian sequences may 
be exhibited as R-sequences or divisors of R-sequences.§ (See next section.) 
I shall show here in detail that the determination of the law of apparition 


* See Lucas [1], pp. 209, 289, 294, or Lehmer [1], pp. 421, 422. 

† This restriction is removed in the body of the paper. 

‡ It is possible to exhibit both (R) sequences and (U) sequences as Pierce sequences. For if we 
let then (a"—£")/(a—B) Accordingly if we denote the products 
in any order by a1, e, , ext, then 


kt fe"—1 
= (- 1) TT ( ) 


and (¢,"—1)(e"—1) - - - (exi—1) is the function studied by Pierce in the paper cited. 

A similar result holds for (U). Since we must then deduce the properties of (R) from a polynomial 
$(x - \epsilon_1) \cdots (x - \epsilon_{kl})$ of higher degree than f(x) or g(x) with non-integral coefficients whose 
factorization depends in a highly complicated manner upon f(x) and g(x), the reduction appears to be of 
only formal interest. 

§ With the qualifications described in §3, I have found empirically no Lucasian sequences which 
are not R-sequences. 
depends upon the fundamental problem of determining the period of a mark 
in a finite field. My results are sufficiently precise to give a good deal of spe- 
cific information about the terms divisible by a given prime in any numerical 
example of an R-sequence. 

The sequence (U) is also of importance because of the following theorem: 


THEOREM 2.1. Let the Lucasian sequence (u) belong to the polynomial f(x), 
and let p be any prime which does not divide the discriminant of f(x). Then 
every place of apparition of p in (u) is a place of apparition of p in the Lucasian 
sequence (U) generated by f(x). 

3. Another extensive class of Lucasian sequences arises as follows. Consider 
for simplicity a sequence (U) with an irreducible generator f(x). The 
galois group of f(x) may be represented as a transitive permutation group 
upon the roots $\{\alpha_1\}, \{\alpha_2\}, \ldots, \{\alpha_k\}$. 

Now let us represent the group as a permutation group upon the k(k−1)/2 
pairs of roots $\{\alpha_1, \alpha_2\}, \{\alpha_1, \alpha_3\}, \ldots, \{\alpha_{k-1}, \alpha_k\}$. 

If the group is singly transitive over the $\{\alpha_i\}$, the pairs $\{\alpha_i, \alpha_j\}$ may be 
separated into $\kappa \geq 2$ transitive sets 


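$\{\alpha_{i_1}, \alpha_{i_1}'\}, \{\alpha_{i_2}, \alpha_{i_2}'\}, \ldots, \{\alpha_{i_{s_i}}, \alpha_{i_{s_i}}'\}$ 


$i = 1, 2, \ldots, \kappa; \qquad s_1 + s_2 + \cdots + s_\kappa = k(k-1)/2.$ 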


We have a corresponding arithmetical factorization of the general term 
$U_n$ of (U) into a product of $\kappa$ rational integers: 


$U_n = U_n^{(1)} U_n^{(2)} \cdots U_n^{(\kappa)}.$ 


Each of the $\kappa$ sequences $(U^{(i)})$ is obviously Lucasian.* 
We shall refer to sequences obtained in this manner as divisors of R-se- 
quences. The determination of the laws of apparition of primes in divisors of 


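* For example, suppose that k=4 and that $f(x) = x^4 - c_1x^3 - c_2x^2 - c_3x - c_4 = x^4 + (2Q - R)x^2 + Q^2$ 
where Q and R are co-prime integers and R is not a square. Then with a proper notation, 
$(x - \alpha_1)(x - \alpha_2) = x^2 - R^{1/2}x + Q$, $\alpha_3 = -\alpha_1$, $\alpha_4 = -\alpha_2$. There are two transitive sets of the $\{\alpha_i, \alpha_j\}$; namely, 
$\{\alpha_1, \alpha_2\}, \{\alpha_1, \alpha_4\}, \{\alpha_2, \alpha_3\}, \{\alpha_3, \alpha_4\}$ and $\{\alpha_1, \alpha_3\}, \{\alpha_2, \alpha_4\}$. 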


We find that U,=U,U,,@ where 
a" — a" \?/ — (—a2)” 
U, = = = (40) "71, 


— a + ae 
Now $(\alpha_1^n - \alpha_2^n)/(\alpha_1 - \alpha_2)$ is one of the important functions introduced by D. H. Lehmer in his doctor’s 
thesis (Lehmer [1]), and $[\alpha_1^n - (-\alpha_2)^n]/(\alpha_1 + \alpha_2)$ is immediately expressible in terms of Lehmer’s 
$U_n$ and $V_n$. 
The function $N(\alpha^n - \beta^n)$ studied by Marshall Hall (Hall [2]) may be similarly exhibited as a 
divisor of a certain R-sequence. 
R-sequences is an important part of our general problem. But to avoid 
stretching the present paper to an inordinate length, we shall give our in- 
vestigations elsewhere. The problem amounts to correlating the results ob- 
tained in this paper by the use of Schatanovski’s principle (§7) with results 
obtained from the Dedekind-Hilbert theory of the ideals of a galois field. 

4. The law of apparition of primes in R-sequences is determined as follows. 
Consider first the sequence (R). We show (§§6, 7) that it suffices to 
consider primes which do not divide the resultant of the generators of (R). 
We have decompositions of f(x) and g(x) modulo p of the form 


$f(x) \equiv f_1(x) f_2(x) \cdots f_r(x), \qquad g(x) \equiv g_1(x) g_2(x) \cdots g_s(x) \pmod{p}$ 


where the polynomials $f_i$ and $g_j$ are primary, irreducible and co-prime in pairs 
modulo p. We show in §8 that we have a corresponding decomposition of the 
general term of (R) modulo p 

$R_n(f, g) \equiv \prod_{i,j} R_n(f_i, g_j) \pmod{p}.$ 

In the terminology of Ward [1], the sequence (R) factors modulo p into 
a product of simpler sequences; for the $f_i$ and $g_j$ are irreducible modulo p. 
But then (Ward [1]) the set $S_p$ of places of apparition of p in (R) is the union 
of the sets of places of apparition of p in the sequences $(R(f_i, g_j))$. Therefore 
in discussing the law of apparition of primes in (R) we may assume that the 
generators of (R) are irreducible modulo p. A like simplification holds for the 
sequence (U) (§9). 

5. If the generator of (U) is irreducible modulo p, the law of apparition 
of p in (U) takes the following beautifully simple form, affording a far-reaching 
generalization of the classical results of Lucas [1]: 


THEOREM 5.1. Let f(x) be irreducible modulo p, and let $\lambda$ be its period* 
modulo p. Let $k = q_1^{a_1} q_2^{a_2} \cdots q_\kappa^{a_\kappa}$ be the decomposition of its degree k into prime 
factors. Let $\rho(s)$ be defined for any positive integer s as the residual† of $p^s - 1$ 
with respect to $\lambda$; that is, the quotient of $\lambda$ by the greatest common divisor of $\lambda$ and 
$p^s - 1$. Then the ranks of apparition of p in (U) occur among‡ the $\kappa$ numbers 
$\rho(k/q_1), \ldots, \rho(k/q_\kappa)$, the rank of p in (U) divides $\rho(k/q_1 q_2 \cdots q_\kappa)$, and p 
has at most $\kappa$ ranks of apparition. 
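
The quantities in Theorem 5.1 are easy to compute for small cases. The following is a minimal sketch, not the author's procedure: it finds the period of f(x) modulo p by repeated multiplication by x, forms the residuals, and lists the candidate ranks of apparition. The function names are hypothetical, and the code assumes (without checking) that f is irreducible modulo p and that its constant term is not divisible by p.

```python
from math import gcd


def period_of_x(c, p):
    """Least lam >= 1 with x^lam = 1 modulo (p, f),
    where f(x) = x^k - c[0] x^(k-1) - ... - c[k-1]."""
    k = len(c)
    one = [1] + [0] * (k - 1)  # coefficients of 1, x, ..., x^(k-1)
    state, lam = one[:], 0
    while True:
        # Multiply by x, then replace x^k by c_1 x^(k-1) + ... + c_k.
        top = state[-1]
        state = [0] + state[:-1]
        for i in range(k):
            state[i] = (state[i] + top * c[k - 1 - i]) % p
        lam += 1
        if state == one:
            return lam


def residual(lam, n):
    """The residual lam : n, that is lam divided by gcd(lam, n)."""
    return lam // gcd(lam, n)


def prime_factors(k):
    """Distinct prime factors of k in increasing order."""
    factors, q = [], 2
    while q * q <= k:
        if k % q == 0:
            factors.append(q)
            while k % q == 0:
                k //= q
        q += 1
    if k > 1:
        factors.append(k)
    return factors


def candidate_ranks(c, p):
    """The numbers rho(k/q), dropping any that is a multiple of another."""
    k = len(c)
    lam = period_of_x(c, p)
    cands = sorted({residual(lam, p ** (k // q) - 1) for q in prime_factors(k)})
    return [r for r in cands if not any(t != r and r % t == 0 for t in cands)]
```

For f(x) = x² − x − 1 and p = 3 the period is 8 and the single candidate is 4; for p = 7 the candidate is 8. These agree with the first Fibonacci numbers divisible by 3 and by 7, namely 3 (the fourth) and 21 (the eighth).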


We observe that the numbers $\rho(k/q)$ are known as soon as the period is 
known. 


* The period of f(x) modulo p is by definition the smallest positive value of $\lambda$ such that $x^\lambda \equiv 1$ 
(modd p, f(x)). 

† The operation of residuation has important arithmetical applications. I have developed some 
of these in the paper, Ward [3], which arose out of the present investigations. 

‡ We must exclude from the set of $\rho(k/q)$ any element which is a multiple of any other. 
Unlike the ranks of apparition of p in (U), the ranks of apparition of p 
in (R) are not obtainable from the periods of the generators f(x) and g(x) of 
(R) alone when the generators are irreducible modulo p. If $f(\alpha) = 0$, $g(\beta) = 0$, 
the ranks of apparition occur among the l periods $\sigma_1, \sigma_2, \ldots, \sigma_l$ modulo p 
of the algebraic numbers $\alpha\beta^{-p}, \alpha\beta^{-p^2}, \ldots, \alpha\beta^{-p^l}$ in the galois field of the 
roots of the generators (§11). In §14, we assign upper and lower limits to the 
periods $\sigma$ in terms of the periods and restricted periods of f(x) and g(x). 

The least common multiples of pairs of the periods $\sigma$ have the following 
remarkable property (§13): 

Here m is the least common multiple of the degrees of f(x) and g(x) and we 
adopt the convention that $\sigma_x = \sigma_y$ if $x \equiv y \pmod{l}$. 

It appears unlikely that results of simplicity comparable to Theorem 5.1 

exist for the law of apparition of primes in (R). 


II. REDUCTION TO R-SEQUENCES WITH IRREDUCIBLE GENERATORS 


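6. This section is devoted to some algebraic preliminaries. Let x; 
$y_1, y_2, \ldots, y_k$; $z_1, z_2, \ldots, z_l$ be k+l+1 indeterminates, and let $Y_1, -Y_2, \ldots, (-1)^{k-1}Y_k$; 
$Z_1, -Z_2, \ldots, (-1)^{l-1}Z_l$ be the k+l elementary symmetric 
functions of the indeterminates y, z defined by* 


$(x - y_1)(x - y_2) \cdots (x - y_k) = x^k - Y_1 x^{k-1} - \cdots - Y_k,$ 


$(x - z_1)(x - z_2) \cdots (x - z_l) = x^l - Z_1 x^{l-1} - \cdots - Z_l.$ 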


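By the fundamental theorem on symmetric functions, the polynomials 


$\Phi_{k,l}(y, z) = \prod_{i=1}^{k} \prod_{j=1}^{l} \left( \frac{y_i^n - z_j^n}{y_i - z_j} \right), \qquad \Psi_k(y) = \prod_{i<j} \left( \frac{y_i^n - y_j^n}{y_i - y_j} \right)$ 

may be expressed as polynomials in the Y and Z with integral coefficients; 
we write 


(6.2) $\Phi_{k,l}(y, z) = P_{k,l}(Y, Z), \qquad \Psi_k(y) = Q_k(Y).$ 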


Suppose now that $t_1, t_2, \ldots, t_m$ are m new indeterminates where $k \geq m \geq 1$, 
$l \geq m \geq 1$, and consider the effect of substituting $t_1$ for $y_k$ and $z_l$, $t_2$ for $y_{k-1}$ and 
$z_{l-1}$, $t_3$ for $y_{k-2}$ and $z_{l-2}$, and so on, in the identity (6.2). If we let 


* Minus signs are introduced so that the associated difference equation used later 
* * may have all its signs positive. 
then the polynomial $P_{k,l}$ on the right of (6.2) is transformed into a polynomial 
$P_{k,l,m}$ in the arguments Y′, Z′, T′ with integral coefficients. Its expression 
in terms of y, z, t is easily found to be 


m —Ox—m,t—m(¥, 2)Ox—m,m(Y; 
Hence by (6.2) 


Pirm(Y’, 2’, T’) 


(6.3) 


Now let $R_n = P_{k,l}(Y, Z)$, $U_n = Q_k(Y)$, $R_n^* = P_{k,l,m}(Y', Z', T')$, and consider 
the sequences 
(R): $R_0, R_1, R_2, \ldots$ 
(U): $U_0, U_1, U_2, \ldots$ 
(R*): $R_0^*, R_1^*, R_2^*, \ldots$ 
THEOREM 6.1. (R), (U), and (R*) are Lucasian in the rings formed by adjoining 
respectively Y, Z; Y; Y′, Z′, T′ to the ring of rational integers. 


Proof. The sequences evidently lie in the specified rings. Consider (R). 
Since its general term is a product of cyclotomic functions $(y^n - z^n)/(y - z)$ 
having the divisibility property, (R) has the same property; that is, $R_n$ divides 
$R_m$ if n divides m, and the division may be performed in the ring of Y 
and Z. The linearity of (R) over the ring follows from a general theorem in 
Ward [1]. Hence (R) is Lucasian. Similarly (U) is Lucasian. Then (R*) as a 
product of the seven Lucasian sequences with general terms n”, T/-!, 
Pann Z'), PraalY’, Z'), QOn(T"’), Q0n(T’), and is 
also Lucasian (Ward [1]). 

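7. We now consider the sequence (R) of §2 of the introduction. Let $\mathfrak{R}$ 
denote the ring of rational integers, and let 


$f(x) = x^k - c_1 x^{k-1} - \cdots - c_k, \qquad g(x) = x^l - d_1 x^{l-1} - \cdots - d_l$ 


be two polynomials with fixed rational integral coefficients. Let $\alpha_1, \ldots, \alpha_k$; 
$\beta_1, \ldots, \beta_l$ be their roots, $D_f$ and $D_g$ their discriminants, and 


$R_{f,g} = \pm \prod (\alpha_i - \beta_j)$ 
their resultant. If $R_{f,g}$ does not vanish, we define a sequence 
(R): $R_0, R_1, R_2, \ldots$ 


in the notation of §6 by $R_n = \Phi_{k,l}(\alpha, \beta) = P_{k,l}(c, d)$. 
If $R_{f,g}$ vanishes, then 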


(7.1) $f(x) = f'(x)h'(x), \qquad g(x) = g'(x)h'(x),$ 
where 
$f'(x) = x^{k-m} - c_1' x^{k-m-1} - \cdots - c_{k-m}',$ 
(7.2) $g'(x) = x^{l-m} - d_1' x^{l-m-1} - \cdots - d_{l-m}',$ 
$h'(x) = x^m - e_1' x^{m-1} - \cdots - e_m'$ 


are polynomials in $\mathfrak{R}$ and $R_{f',g'} \neq 0$. Deviating for simplicity from the notation 
of the previous section, we now define the sequence (R) (instead of a new 
sequence (R*)) by letting $R_n = P_{k,l,m}(c', d', e')$. In each case we obtain a 
Lucasian sequence over $\mathfrak{R}$. 

Consider now the places of apparition of any prime number p in (R). 
There are two cases to consider according as p does or does not divide the 
resultant $R_{f,g}$. 
Case 1. $R_{f,g} \not\equiv 0 \pmod{p}$. Then $R_n \equiv 0 \pmod{p}$ if and only if 


$\Phi_{k,l}(\alpha, \beta) = \prod \left( \frac{\alpha_i^n - \beta_j^n}{\alpha_i - \beta_j} \right) \equiv 0 \pmod{p}.$ 


Case 2. $R_{f,g} \equiv 0 \pmod{p}$. In this case (7.1) and (7.2) hold modulo p with 
$R_{f',g'} \not\equiv 0 \pmod{p}$: 
$f(x) \equiv f'(x)h'(x) \pmod{p}, \qquad g(x) \equiv g'(x)h'(x) \pmod{p}.$ 


We now make use of the following principle: 


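SCHATANOVSKI’S PRINCIPLE.† If $\phi(y_1, y_2, \ldots, y_k)$ is an integral symmetric 
function of the indeterminates $y_1, y_2, \ldots, y_k$ with integral coefficients, and if 
for a natural number m 


$f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k) \equiv (x - \gamma_1)(x - \gamma_2) \cdots (x - \gamma_k) \pmod{m}$ 


where f(x) is a polynomial with integral coefficients, then 


(7.3) $\phi(\alpha_1, \ldots, \alpha_k) \equiv \phi(\gamma_1, \ldots, \gamma_k) \pmod{m}.$ 
Let 
$\phi(y_1, \ldots, y_k) = \prod \left( \frac{y_i^n - \beta_j^n}{y_i - \beta_j} \right)$ 


and let $\gamma_1, \gamma_2, \ldots, \gamma_k$ be the roots of f′(x)h′(x)=0 in a definite order. Then 
on taking m=p, (7.3) gives us 


$R_n = R_n(f, g) \equiv R_n(f'h', g) \pmod{p}.$ 


† See Schatanovski [1], Lubelski [1], [2]. The principle is also used constantly in Ward [2]. 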
Here and later if $R_{f,g}$ vanishes, we can replace the congruence by an equality. 
A second application of Schatanovski’s principle gives us 


$R_n = R_n(f, g) \equiv R_n(f'h', g'h') = P_{k,l,m}(c', d', e') \pmod{p}.$ 


Hence we obtain from (6.3) the congruence 
(7.4) Ry = (e’) (mod ). 


In particular then $R_p \equiv 0 \pmod{p}$. Since p has no proper divisors and $R_1 = 1$, 
we thus obtain the following theorem: 


THEOREM 7.1. p is a rank of apparition of any prime p in (R) which divides 
the resultant R;,, of the polynomials f(x) and g(x) which generate (R). 
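
Theorem 7.1 can be observed on a small example. The sketch below is an illustration only, with the generators f(x) = x² − x − 1 and g(x) = x − 3 chosen here for convenience (they are not taken from the paper); their resultant is f(3) = 5. It assumes the elementary identity $\prod_i (\alpha_i^n - b^n) = (-1)^n - b^n L_n + b^{2n}$, where $L_n = \alpha_1^n + \alpha_2^n$ are the Lucas numbers, and the function names are hypothetical.

```python
def lucas_numbers(limit):
    """[L_0, ..., L_limit], L_n = alpha_1^n + alpha_2^n for f(x) = x^2 - x - 1."""
    terms = [2, 1]
    while len(terms) <= limit:
        terms.append(terms[-1] + terms[-2])
    return terms[: limit + 1]


def r_terms(b, limit):
    """R_1, ..., R_limit for f(x) = x^2 - x - 1 and g(x) = x - b."""
    lucas = lucas_numbers(limit)
    base = b * b - b - 1  # (alpha_1 - b)(alpha_2 - b) = f(b), assumed nonzero
    terms = []
    for n in range(1, limit + 1):
        top = (-1) ** n - b ** n * lucas[n] + b ** (2 * n)
        terms.append(top // base)  # the division is exact
    return terms


r = r_terms(3, 10)
first_place_of_5 = next(n for n, value in enumerate(r, start=1) if value % 5 == 0)
```

Here the first terms are 1, 11, 124, 1199, 11275, and the first term divisible by 5 is $R_5$, so that 5 is a rank of apparition of the prime 5 dividing the resultant.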


Now clearly 
Ck = Ck-m€m (mod p), d; = di_mém (mod p), (cé—m, di_m) # 0 (mod 


Hence $e_m' \equiv 0 \pmod{p}$ if and only if $c_k \equiv d_l \equiv 0 \pmod{p}$. 

Also #0 (mod #0 (mod p). If we assume that 
Ry,» =0 (mod p), we have a congruence similar to (7.4) for Px—m.m(c’, e’); 
with an obvious extension of notation 


By what we have just shown, e,=0 (mod #) if and only if e,” =c-m=0 
(mod p). A like result holds if R,-,,,=0 (mod p). Now it is easily seen that 
in case 1, p is not a null divisor of (R). Hence we obtain the theorem: 


THEOREM 7.2. p is a null divisor of the Lucasian sequence (R) if and only if p 
divides the constant terms $c_k$ and $d_l$ of the polynomials f(x) and g(x) which generate 
(R). 


Hence if p is not such a null divisor of (R), the determination of its places 
of apparition in case 2 reduces by virtue of (7.4) to determining its places of 
apparition in various sequences dividing (R) modulo p but for which p does 
not divide the associated resultant. For (Ward [1] Theorem 6.3) the set of 
places of apparition in the product of two or more sequences is the union of 
the sets of places of apparition in the constituent sequences, and the ranks of 
apparition in the product are immediately specifiable in terms of the ranks 
of apparition in the constituents. It suffices therefore to consider only case 1. 

8. We next prove that it suffices to consider the case when f(x) and g(x) 
are irreducible modulo p. With our previous notation, let p be a prime which 
does not divide the resultant of the generators of (R). Let the decompositions 
of the polynomials f(x) and g(x) modulo p be 


$f(x) \equiv f_1(x) f_2(x) \cdots f_r(x) \pmod{p}, \qquad g(x) \equiv g_1(x) g_2(x) \cdots g_s(x) \pmod{p}.$ 


Here the polynomials $f_1(x), \ldots, g_s(x)$ have integral coefficients, and are primary, 
irreducible, and co-prime in pairs modulo p. Schatanovski’s principle 
gives us then the congruence 
$R_n = R_n(f, g) \equiv R_n(f_1 f_2 \cdots f_r, g_1 g_2 \cdots g_s) \pmod{p}.$ 


On using the elementary multiplicative properties of resultants (Fricke 
[1]) this last congruence may be written 

$R_n(f, g) \equiv \prod_{i,j} R_n(f_i, g_j) \pmod{p}.$ 

Hence it follows as in §3 that we may confine ourselves to the case where 
the generators of (R) are irreducible modulo p. 

9. In determining the law of apparition of primes in the sequence (U), we 
can similarly confine ourselves to the case when the generator of (U) is ir- 
reducible modulo p. It would at first appear as if this result were a special 
case of the reduction for (R), since (U) is obtainable from (R) by setting 
g(x) =df(x)/dx. But the leading coefficient of df/dx is not unity but k, so that 
the primes dividing k would be unclassified by this method. It is however 
possible to parallel the reduction for (R), and the process is so similar that 
we shall merely indicate the main steps. 

We begin as in §6 by considering the effect upon 


(9.1) $\Psi_k(y) = \prod_{i<j} \left( \frac{y_i^n - y_j^n}{y_i - y_j} \right) = Q_k(Y)$ 
of substituting, in place of y1,---, yx, #4 distinct new indeterminates 


tn, +, tw, +, bx, So that we have 


u=1 i=l 


+ doko + = k, 


and at least one a, is greater than unity. The right side of (9.1) then becomes 
a polynomial in the quantities 7), - - - , JT, defined by 


(x tur) (x = tus) (x tuku) = — 


The value of the left side of (9.1) is then easily found to be 


(9.2) + n'T] T] { Tr) }™, 


u=1 u=1 u,v=1 
u<v 


Au = — 1), = + + 


in analogy with formula (6.3). 




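Consider now the sequence (U) of §2 with the generator 


$f(x) = x^k - c_1 x^{k-1} - \cdots - c_k$ 


and discriminant 


$D_f = \pm \prod_{i<j} (\alpha_i - \alpha_j)^2.$ 


If $D_f$ does not vanish, we define the sequence 
(U): $U_0, U_1, \ldots$ by $U_n = \Psi_k(\alpha) = Q_k(c)$. 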
If D; vanishes, we have 
(9.3) f(x) = { fala) fala) 


where fu(«) - =(4—Tu1) and D,, +0, 
#0, We then define U, by means of (9.2) as 


u,v=l 
u<v 


Now consider the places of apparition of any prime p in (U). As in the 
case of (R), there are two cases according as p does or does not divide the 
discriminant $D_f$. 

Case 1. $D_f \not\equiv 0 \pmod{p}$. Then $U_n \equiv 0 \pmod{p}$ if and only if 


Vi(a) = II ( 


Case 2. $D_f \equiv 0 \pmod{p}$. In this case (9.3) holds modulo p where we may 
assume that the polynomials $f_u(x)$ are irreducible modulo p and relatively 
prime in pairs modulo p. We deduce then from Schatanovski’s principle that 


af — 
) = 0 (mod P). 


aj — a; 


u=1 


This congruence is the analogue of (7.4). We deduce the theorems: 
THEOREM 9.1. p is a rank of apparition of any prime p in (U) which divides 
the discriminant $D_f$ of the polynomial f(x) which generates (U). 
THEOREM 9.2. p is a null divisor of the Lucasian sequence (U) if and only 
if p divides the last two coefficients $c_k$ and $c_{k-1}$ of the polynomial f(x) which generates 
(U). 


Formula (9.5) also shows us that it suffices to consider case 1 for (U) or 
(R). But in case 1 for (U), we have a decomposition (9.3) of f(x) modulo p 
with all the a, unity. Thus a decomposition (9.5) applies with all the a,, a, 
unity, all the 4, zero, and / zero. We thus deduce that it suffices in every case 
to assume the generators of (U) and (R) irreducible modulo p. 

Formula (9.5) shows that the law of apparition of primes in the sequence 
(U) depends on the law of apparition in (R), for each sequence with general 
term P,,,4,(Cu, Cv) is a special (R) sequence. 


III. LAWS OF APPARITION FOR R-SEQUENCES WITH 
IRREDUCIBLE GENERATORS 


10. We shall now determine the law of apparition of primes p in (R) when 
the generators of (R) are irreducible modulo p. 
With our previous notation, let 


$f(x) = x^k - c_1 x^{k-1} - \cdots - c_k = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k),$ 
$g(x) = x^l - d_1 x^{l-1} - \cdots - d_l = (x - \beta_1)(x - \beta_2) \cdots (x - \beta_l)$ 
be the generators of (R). Both f(x) and g(x) are algebraically irreducible. Let 


$\mathfrak{K}$ denote the galois field of the roots of f(x)=0 and g(x)=0 obtained by adjoining 
the k+l quantities $\alpha_1, \ldots, \beta_l$ to the field of rationals. 


Lemma 10.1. p is a prime ideal of $\mathfrak{K}$. 
Proof. If $\mathfrak{O}$ is the ring of integers of $\mathfrak{K}$, it suffices to show that the quotient 
ring $\mathfrak{O}/[p]$ is a field. Let $\mathfrak{R}$ as before denote the ring of rational integers, and 
let $\alpha$ be any root of f(x)=0, $\beta$ any root of g(x)=0. Construct the ring 
$\mathfrak{o} = \mathfrak{R}[\alpha, \beta]$. Clearly $\mathfrak{O}$ contains $\mathfrak{o}$. Hence $\mathfrak{O}/[p]$ contains $\mathfrak{o}/[p]$. We shall now 
show that $\mathfrak{o}/[p]$ contains $\mathfrak{O}/[p]$ so that 


(10.1) $\mathfrak{O}/[p] = \mathfrak{o}/[p].$ 
To prove this it suffices to show that every element of $\mathfrak{O}$ is congruent modulo 
p to an element of $\mathfrak{o}$. Let D be the discriminant of the field $\mathfrak{K}$. Then (Hilbert 
[1], Theorem 85, page 144) 


(10.2) (p, D) = 1. 


For since both f(x) and g(x) are irreducible modulo p, p is prime to their 
discriminants. 

We can choose rational integers $e_1, \ldots, e_k$; $f_1, \ldots, f_l$ such that 
$\theta = e_1\alpha_1 + \cdots + e_k\alpha_k + f_1\beta_1 + \cdots + f_l\beta_l$ is a primitive element of $\mathfrak{K}$. But we 
have the congruences in $\mathfrak{O}$ 


(10.3) $\alpha_i \equiv \alpha^{p^{r_i}} \pmod{p}$; $\beta_j \equiv \beta^{p^{s_j}} \pmod{p}$, $i = 1, \ldots, k$; $j = 1, \ldots, l$, 
where $r_1, \ldots, r_k$; $s_1, \ldots, s_l$ are the integers $1, \ldots, k$; $1, \ldots, l$ in some 
order. Hence $\theta$ is congruent modulo p to an element of $\mathfrak{o}$. But if $\nu$ is the degree 
of the field $\mathfrak{K}$ and D as before its discriminant, the $\nu$ elements $D^{-1}, \theta D^{-1}, \ldots, 
\theta^{\nu-1}D^{-1}$ are a basis of $\mathfrak{O}$. Hence by (10.2), each element of this basis is 
congruent modulo p to an element of $\mathfrak{o}$. Hence (10.1) follows. 

Now the ring $\mathfrak{o}/[p]$ may be obtained either by first adjoining $\alpha$ and $\beta$ to $\mathfrak{R}$ 
and then forming the quotient ring, or else by first forming the quotient ring 
$\mathfrak{R}/[p]$ and then adjoining $\alpha$ and $\beta$. Since $\mathfrak{R}/[p]$ is a field, $\mathfrak{o}/[p]$ is 
consequently a field, so that by (10.1), $\mathfrak{O}/[p]$ is a field. 

11. Now assume that for a certain value of n 


$R_n = \prod \left( \frac{\alpha_i^n - \beta_j^n}{\alpha_i - \beta_j} \right) \equiv 0 \pmod{p}.$ 
Since p is prime to the resultant of f(x) and g(x), we see from Lemma 10.1 
that this congruence can hold if and only if 


(11.1) $\alpha_i^n \equiv \beta_j^n \pmod{p}$ 
in $\mathfrak{O}$ for some values of the subscripts i and j. 


On multiplying (11.1) by $\beta_j^{-n}$, raising to the proper power of p, and utilizing 
(10.3) we obtain as a necessary and sufficient condition that p divide $R_n$† 


(11.2) $\{\alpha\beta^{-p^s}\}^n \equiv 1 \pmod{p}, \qquad 1 \leq s \leq l.$ 


Now $\alpha\beta^{-p^s}$ is an element of the finite galois field $\mathfrak{K}^* = \mathfrak{R}[\alpha, \beta]/[p]$ of order 
$p^m$ where m is the least common multiple of the degrees of f(x) and g(x). Let 
$\sigma_s$ be its period. Then (11.2) holds if and only if 


(11.3) $n \equiv 0 \pmod{\sigma_s}.$ 
We thus obtain the following theorem: 


THEOREM 11.1. If $\sigma_s$ is the period of $\alpha\beta^{-p^s}$ modulo p in $\mathfrak{O}/[p]$, then 
$\sigma_1, \sigma_2, \ldots, \sigma_l$ constitute a set of generators for the multiplicative set $S_p$ of places 
of apparition of p in (R). 

If we regard the solution of the problem of determining the period of a 
mark in a finite field as known, the law of apparition of p in (R) is determined: 
all ranks of apparition necessarily occur in the set $\sigma_1, \sigma_2, \ldots, \sigma_l$, and to obtain 
them we merely reject all $\sigma_i$ which are multiples of other $\sigma_j$ in the set. 
The rank of p is then the least common multiple of the surviving $\sigma$, and the 
set $S_p$ is exactly specified. 
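
The rejection step just described is purely arithmetical and can be sketched as follows. This is an illustration with made-up periods (the list passed in is hypothetical, not data from the paper); the function names are likewise not the author's.

```python
from functools import reduce
from math import gcd


def surviving_generators(sigmas):
    """Drop every period that is a multiple of another period in the set."""
    distinct = sorted(set(sigmas))
    return [s for s in distinct
            if not any(t != s and s % t == 0 for t in distinct)]


def rank_from_generators(sigmas):
    """Least common multiple of the surviving periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), surviving_generators(sigmas), 1)


def is_place(n, sigmas):
    """True if n is a multiple of at least one generator."""
    return any(n % s == 0 for s in sigmas)


crude = [12, 4, 6, 4, 18]  # hypothetical periods sigma_1, ..., sigma_l
ranks = surviving_generators(crude)
```

Rejecting the redundant periods does not change the set of multiples generated, which is why the surviving periods are exactly the ranks of apparition.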

12. From a more realistic standpoint, the period of a mark in a finite 


† If $d_l'$ is chosen so that $d_l' d_l \equiv 1 \pmod{p}$, an explicit expression for $\beta^{-1}$ is given by the congruence 
$\beta^{-1} \equiv d_l'(\beta^{l-1} - d_1\beta^{l-2} - \cdots - d_{l-1}) \pmod{p}.$ 
field is not given to us by merely specifying the field and the mark, so that it 
becomes important to reduce the number of crude generators $\sigma_1, \ldots, \sigma_l$ of 
$S_p$ as much as possible. Before giving the details of this reduction, we shall 
consider the sequence (U) for the case when its generators are irreducible 
modulo p. 

By a repetition of the arguments applied to (R) in the previous section, we deduce that if (U) is generated by a polynomial $f(x)$ which is irreducible modulo $p$, then

$U_n \equiv 0 \pmod p$

if and only if

$n \equiv 0 \pmod{\rho_s},$

where $\rho_s$ is the period of $\alpha^{p^s-1}$ in the field $\mathfrak{R}[\alpha]/[p]$, $s$ is an integer $\ge 1$ and $< k$, the degree of $f(x)$, and $\alpha$ is any root of $f(x)=0$.


But if $\lambda$ is the period of $\alpha$, the period $\rho_s$ of $\alpha^{p^s-1}$ is easily seen to be the residual† of $p^s-1$ with respect to $\lambda$. In the usual notation for residuals,

(12.1) $\rho_s = \lambda : p^s - 1.$

We observe in particular that

(12.2) $\rho_k = 1, \quad \rho_1 = \mu.$

Here $\mu$ is the restricted period of $f(x)$ modulo $p$; that is, the least positive integer such that (Ward [2], p. 284)

$\alpha_1^\mu \equiv \alpha_2^\mu \equiv \cdots \equiv \alpha_k^\mu \pmod p.$

Now (Ward [4], p. 627) by (12.1)

$[\rho_s, \rho_t] = [\lambda : p^s - 1,\ \lambda : p^t - 1] = \lambda : (p^s - 1,\ p^t - 1)$

since the sequence $0, p-1, p^2-1, p^3-1, \cdots$ has the property that $(p^s-1, p^t-1) = p^{(s,t)}-1$ (Lucas [1], Ward [5], [6]). Thus

(12.3) $[\rho_s, \rho_t] = \rho_{(s,t)}.$

It follows from (12.3) that if $s$ divides $t$, then $\rho_t$ divides $\rho_s$. On taking $t=k$ in (12.3) and using (12.2), we see that

(12.4) $\rho_s = \rho_d$ where $d = (s, k)$ divides $k$.


† For the properties of residuals used here, see Ward [3], [4].

We therefore need consider only periods $\rho_d$ where $d$ divides $k$. But† if $d \mid d' \mid k$, then $\rho_{d'} \mid \rho_d$.

We therefore need consider only periods $\rho_d$ where $d$ divides $k$ and no multiple of $d$ divides $k$. On collecting these results, we obtain Theorem 5.1 of the introduction.
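
As a numerical sanity check on (12.1) to (12.4), the residual notation can be spelled out in Python. The sketch below assumes only that the residual $a : b$ means $a$ divided by the greatest common divisor of $a$ and $b$, and that $\lambda$ divides $p^k - 1$; the names `residual`, `rho` and `check_relations` are illustrative and not part of the paper.

```python
from math import gcd, lcm

def residual(a, b):
    """Residual a : b, taken here to mean a divided by gcd(a, b)."""
    return a // gcd(a, b)

def rho(lam, p, s):
    """Period of alpha^(p^s - 1) when alpha has period lam, as in (12.1)."""
    return residual(lam, p**s - 1)

def check_relations(lam, p, k):
    """Check rho_k = 1, (12.3) and (12.4) for all s, t in 1..k."""
    assert (p**k - 1) % lam == 0            # lam must divide p^k - 1
    assert rho(lam, p, k) == 1              # first half of (12.2)
    for s in range(1, k + 1):
        assert rho(lam, p, s) == rho(lam, p, gcd(s, k))              # (12.4)
        for t in range(1, k + 1):
            assert lcm(rho(lam, p, s), rho(lam, p, t)) == rho(lam, p, gcd(s, t))  # (12.3)
    return True
```

Calling `check_relations(728, 3, 6)` or `check_relations(91, 3, 6)` exercises the relations for $p = 3$, $k = 6$, where $p^k - 1 = 728$.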

Consider the residual $p^k-1 : p^s-1$. Since $\lambda$ divides $p^k-1$, $\lambda : p^s-1$ divides $p^k-1 : p^s-1$ (Ward [4], formula (4.51)). Hence $\rho_s$ divides $p^k-1 : p^s-1$, $(s = 1, 2, 3, \cdots, k)$.

We thus obtain from Theorem 5.1 the following result which gives us a useful upper limit to the ranks of apparition of $p$.

THEOREM 12.1. If $f(x)$ is irreducible modulo $p$ and of degree $k$, the ranks of apparition of $p$ in the sequence (U) generated by $f(x)$ divide the numbers $(p^k-1)/(p^{k/q_1}-1)$, $(p^k-1)/(p^{k/q_2}-1)$, $\cdots$, $(p^k-1)/(p^{k/q_\kappa}-1)$. Here $q_1, q_2, \cdots, q_\kappa$ are the $\kappa$ prime factors of $k$.

If $Q = q_1 q_2 \cdots q_\kappa$, it easily follows that the rank of $p$ in (U) must divide the number $(p^k-1)/(p^{k/Q}-1)$.
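
The bound of Theorem 12.1 can be tried out on a small case. The sketch below rests on an assumption that is not stated in this section: that for $k = 2$ and $f(x) = x^2 - x - 1$ the sequence (U) is the Fibonacci sequence. Under that assumption, and for primes modulo which $f(x)$ is irreducible (2, 3, 7 and 13 are used here), the rank of apparition should divide $(p^2-1)/(p-1)$. All function names are hypothetical.

```python
def prime_factors(k):
    """Distinct prime factors of k in increasing order."""
    out, q = [], 2
    while q * q <= k:
        if k % q == 0:
            out.append(q)
            while k % q == 0:
                k //= q
        q += 1
    if k > 1:
        out.append(k)
    return out

def theorem_12_1_bounds(p, k):
    """The numbers (p^k - 1)/(p^(k/q) - 1), one for each prime factor q of k."""
    return [(p**k - 1) // (p**(k // q) - 1) for q in prime_factors(k)]

def fibonacci_rank(p):
    """Least n > 0 such that the Fibonacci number F_n is divisible by p."""
    a, b, n = 0, 1, 0
    while True:
        a, b, n = b, (a + b) % p, n + 1
        if a == 0:
            return n
```

For instance, `fibonacci_rank(7)` is 8 and `theorem_12_1_bounds(7, 2)` is `[8]`, so the rank divides the bound, as the theorem predicts under the stated assumption.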

13. We return now to the reduction of the generators of the places of apparition of $p$ in (R). With the notation of §10, let $\gamma$ be a primitive element of the finite field $\mathfrak{R}^*$. Then

$\alpha \equiv \gamma^a, \quad \beta \equiv \gamma^b \pmod p$

where $a$ and $b$ are positive integers.‡ Hence

(13.1) $\sigma_s = p^m - 1 : (a - bp^s).$

Here $m$, it will be recalled, is the least common multiple of the degrees of the generators of (R).

We extend the definition of the $\sigma_s$ over the entire ring $\mathfrak{R}$ by letting

(13.2) $\sigma_r = \sigma_s$, if $r \equiv s \pmod l$.

The numbers $\sigma_s$ have the following strange property which stands in remarkable contrast to the property of the ranks of apparition of $p$ in (U) expressed by formula (12.3).


THEOREM 13.1. Let p be a prime, let the generators of the sequence be irre- 
ducible modulo p, and let m be the least common multiple of their degrees. Then 
the least common multiples of pairs of generating elements for the places of 
apparition of p in (R) satisfy the relation 


† We use the usual notation $a \mid b$ for $a$ divides $b$.
‡ If $\lambda_1$ and $\lambda_2$ are the periods of $f(x)$ and $g(x)$ modulo $p$, the numbers $a$ and $b$ are subject to the conditions

$(a, p^m - 1) = p^m - 1 : \lambda_1, \quad (b, p^m - 1) = p^m - 1 : \lambda_2.$


(s = 1,2,---,0). 
Proof. For convenience write $r+s$ in place of $t$ so that (13.3) becomes
By (13.1) and elementary properties of residuals 
= p™ — 1:(p™ — 1, a — bp*, a — bpr**), 


13.4 
| = p™ — 1:(p™ — 1, a — a — 


Thus the proof reduces to showing that the two greatest common divisors 
on the right of (13.4) are equal. Now 
(p™ — 1, a — bp’, a — bprt*) = (p™ — 1, p*b(p” — 1), a — bp’) 
= (p™ 1, 1), a— bp*) 
since (p’, p"—1)=1; and we obtain 
(p™—1, a—bp*, a—bpt*) =(p"—1, b(p™” —1), a—bp*) 
since (pr—1, p™-—1)=p™"—1. Hence since (p-™”, pn—1)=1 and 
(p*, 1) 1, 
(p™ 1, bp’, bp’**) = (p™ 1, pom) 1), bp*) 
(p™ 1, a— => bp’), 
(p" — 1, a — bp*, a — bprt*) = (p™ — 1, p*b(p'™-”) — 1), a — bp*) 
(p™ -i,e= a— bp*). 
It follows from (13.3) that the $l(l-1)/2$ least common multiples $[\sigma_s, \sigma_t]$, $(s, t = 1, \cdots, l;\ s < t)$, may be grouped into a certain number of sets such that all the members of a set are equal to one another.* For example, if $l=6$, $k=2$, we find that the fifteen least common multiples are grouped into six sets:


[o1, 03] = [o1, 05] = [o, 05]; [o2, = [o4, = [o2, o6]; 
a5]; [os, a6]. 


The case when there is only one such set is of particular interest on ac- 
count of the following easily proved theorem: 


THEOREM 13.2. If all of the $l(l-1)/2$ least common multiples $[\sigma_s, \sigma_t]$ are equal to one another, then if there is more than one rank of apparition of $p$ in (R), the rank of $p$ in (R) is the least common multiple of the two smallest $\sigma_s$.† If the smallest $\sigma_s$ divides the next smallest, there is only one rank of apparition.

* But not necessarily unequal to least common multiples in other sets.
† It must not be supposed that there are at most two ranks of apparition. For instance if $l=3$, we might conceivably have $\sigma_1=6$, $\sigma_2=10$, $\sigma_3=15$. The least common multiples $[\sigma_s, \sigma_t]$ then all equal 30.

It can be shown from (13.3) by a simple enumeration that the hypothesis of the theorem is satisfied if $l=2$, 3 or 5; $l=6$ and $k \equiv 0 \pmod 4$; $l=7$ and $k \not\equiv 0 \pmod{60}$.

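14. If we raise the congruence $\alpha_i^n \equiv \beta_j^n \pmod p$ to the $p^l$th and $p^k$th powers successively, we obtain $\alpha_i^{np^l} \equiv \alpha_i^n \pmod p$, $\beta_j^{np^k} \equiv \beta_j^n \pmod p$. Hence if $\lambda_1$ and $\lambda_2$ denote the periods of $f(x)$ and $g(x)$,

$n \equiv 0 \pmod{\lambda_1 : p^l - 1}, \quad n \equiv 0 \pmod{\lambda_2 : p^k - 1},$

where we are using the notation already employed in §12 for residuals. Now $\lambda_1 : p^l-1 = \lambda_1 : (p^k-1, p^l-1)$ since $\lambda_1$ divides $p^k-1$. But $(p^k-1, p^l-1) = p^{(k,l)}-1$. Hence $\lambda_1 : p^l-1 = \lambda_1 : p^{(k,l)}-1$. Similarly $\lambda_2 : p^k-1 = \lambda_2 : p^{(k,l)}-1$. Hence $n \equiv 0 \pmod{[\lambda_1 : p^{(k,l)}-1,\ \lambda_2 : p^{(k,l)}-1]}$ or

(14.1) $n \equiv 0 \pmod{[\lambda_1, \lambda_2] : p^{(k,l)} - 1}.$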

(14.1) gives us a lower limit for every rank of apparition $\sigma$ of $p$ in (R) in terms of the periods of the generators of (R). An upper limit may be obtained as follows:

If $\mu_1$, $\mu_2$ denote the restricted periods of $f(x)$ and $g(x)$ respectively, then


a; =a: =--- =a; =a(modp), =--- = 6; (mod p), 


where $a$ and $b$ are rational integers. Then if $t$ is the least positive value of $x$ such that $a^x \equiv b^x \pmod p$, every other such $x$ is easily shown to be divisible by $t$. Now $t$ as a divisor of $p-1$ is relatively prime to the restricted periods $\mu_1$ and $\mu_2$ (Ward [5]) and hence relatively prime to their least common multiple $[\mu_1, \mu_2]$. It readily follows that the least positive value of $n$ such that

(14.2) $\alpha_1^n \equiv \cdots \equiv \alpha_k^n \equiv \beta_1^n \equiv \cdots \equiv \beta_l^n \pmod p$

is $\mu = t[\mu_1, \mu_2]$. Every other such $n$ is divisible by $\mu$. Since (14.2) is satisfied for $n = [\lambda_1, \lambda_2]$ we see that $\mu$ divides $[\lambda_1, \lambda_2]$.

It is now easy to show (compare M. Hall [1] or Ward [2]) that every rank of apparition of $p$ in (R) divides $\mu$. We thus obtain the following theorem:


THEOREM 14.1. Let the generators of (R) be irreducible modulo $p$ with degrees $k$ and $l$ and with periods and restricted periods $\lambda_1$, $\mu_1$ and $\lambda_2$, $\mu_2$ respectively. Then for every rank of apparition $\sigma$ of $p$ in (R),

(14.3) $[\lambda_1, \lambda_2] : p^{(k,l)} - 1$ divides $\sigma$; $\sigma$ divides $\mu$.

Here $\mu$ divides $[\lambda_1, \lambda_2]$, and $\mu = t[\mu_1, \mu_2]$ is the least positive value of $n$ such that the congruence (14.2) holds.

In particular if / and are co-prime, [Ax, —1 = we]. Hence if 
is a rank of apparition of p so that a,°=8;7 (mod p), (14.3) implies that yu 
divides 




84 MORGAN WARD [July 


THEOREM 14.2. If the generators of (R) are irreducible modulo p and if their 
degrees are relatively prime, there is only one rank of apparition of p in (R). This
rank is the least positive value of n such that the congruence (14.2) holds, and it is 
a multiple of the least common multiple of the restricted periods of the generators 
of (R), and a divisor of the least common multiple of their periods. 


IV. APPLICATIONS TO GENERAL LUCASIAN SEQUENCES 


15. We shall now prove Theorem 2.1 of the introduction. Let $(u)$: $u_0, u_1, u_2, \cdots$ be a Lucasian sequence belonging to the polynomial $f(x) = x^k - \cdots - c_k = (x-\alpha_1)\cdots(x-\alpha_k)$ and let $p$ be any prime dividing neither its constant term* $c_k$ nor its discriminant $D = D_f = \pm\prod_{i<j}(\alpha_i-\alpha_j)^2$.

Let $\mathfrak{R}$ now denote the galois field of the roots of $f(x)=0$ and $\mathfrak{p}$ a prime ideal divisor of $p$ in $\mathfrak{R}$. Then the general term $u_n$ of $(u)$ is of the form

$u_n = A_1\alpha_1^n + \cdots + A_k\alpha_k^n,$

where $DA_1, \cdots, DA_k$ are integers of $\mathfrak{R}$ so that $A_1, \cdots, A_k$ are integers modulo $\mathfrak{p}$. Since $(u)$ is a divisibility sequence, $u_n \equiv 0 \pmod p$ if and only if

$A_1\alpha_1^{mn} + A_2\alpha_2^{mn} + \cdots + A_k\alpha_k^{mn} \equiv 0 \pmod{\mathfrak{p}}, \quad m = 1, 2, \cdots, k.$

Thus the determinant of this system of congruences must be divisible by $\mathfrak{p}$. This determinant may be written $c_k^n \prod_{i<j}(\alpha_i-\alpha_j)\,U_n$. Since $\mathfrak{p}$ is prime to the first two terms, $U_n \equiv 0 \pmod{\mathfrak{p}}$ so that $U_n \equiv 0 \pmod p$. Hence every place of apparition of $p$ in $(u)$ is also a place of apparition of $p$ in (U).

16. Suppose that the $k$ (not necessarily distinct) $n$th powers of the roots of $f(x)=0$ are grouped modulo $\mathfrak{p}$ into $t$ incongruent sets:


n 


(16.1) ai, = a, = = (mod p), i= 1,2,---,8, 
Furthermore let 
(16.2) Ai = Ay + Ag + 
THEOREM 16.1. Any prime $p$ which does not divide the discriminant of $f(x)$ divides a term $u_n$ of the Lucasian sequence $(u)$ belonging to $f(x)$ if and only if

$A_i \equiv 0 \pmod{\mathfrak{p}}, \quad (i = 1, 2, \cdots, t).$

Here $A_i$ is given by formulas (16.1), (16.2)† and $\mathfrak{p}$ is any prime ideal divisor of $p$ in the Galois field of the roots of $f(x)=0$.

* If we are willing to assume that $u_0=0$, we may dispense with this first assumption. Marshall Hall [1] has shown that $u_0$ is usually zero.
† The groupings of the roots in (16.1) depend of course on our choice of $\mathfrak{p}$.


1938] PRIMES IN A LUCASIAN SEQUENCE 85 


Proof. See Ward [2], pp. 284-285.

We may make this result more explicit by the use of Schatanovski's principle. Suppose that the decomposition of $f(x)$ modulo $p$ is

$f(x) \equiv f_1(x) f_2(x) \cdots f_r(x) \pmod p,$

where $f_i(x)$ is primary and irreducible modulo $p$ and of degree $k_i$, and let the roots of $f_i(x)=0$ be $\gamma_1^{(i)}, \gamma_2^{(i)}, \cdots, \gamma_{k_i}^{(i)}$. Then by Schatanovski's principle

$u_n \equiv u_n^{(1)} + u_n^{(2)} + \cdots + u_n^{(r)} \pmod p$


where 


(i) (i) (4) 


Un =I { ye; \" 


satisfies the difference equation associated with f“(«) and 


= uy; ) 
(Ward [2], p. 283). 

Construct the galois field 2=R (yi, - - - , vx,” ), and let M be the ring 
of integers of 2. Then as in §10, p is a prime ideal of &, for p is prime to the 
discriminants and resultants of all the f;(x). Furthermore the ring &/[p] is a 
finite field of order p” where a= [h, ke, - - , Rr]. 

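Suppose that in $\mathfrak{M}$ the $n$th powers of the roots of $f_1(x)=0, \cdots, f_r(x)=0$ are grouped modulo $p$ into incongruent sets as in (16.1) so that we have, omitting subscripts,

(16.3) $\{\gamma^{(i)}\}^n \equiv \{\gamma^{(j)}\}^n \pmod p, \quad i \ne j.$

Then we deduce as in §14 that

(16.4) $n \equiv 0 \pmod{[\lambda^{(i)}, \lambda^{(j)}] : p^{(k_i,k_j)} - 1}.$

Here $\lambda^{(i)}$ and $\lambda^{(j)}$ are the periods of $f^{(i)}(x)$ and $f^{(j)}(x)$ modulo $p$, and $k_i$ and $k_j$ their degrees.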

In particular, if k,) =1, then A]: = ], where 
nu‘ and pw“ are the restricted periods of f‘®(x) and f‘?(x). Now un divides 
p**—1/p—1, divides p*i—1/p—1, and 


jut 


Hence we obtain from (16.4) the following theorem: 


THEOREM 16.2. If the degrees of $f^{(i)}(x)$ and $f^{(j)}(x)$ are relatively prime to one another, then the congruence (16.3) can hold if and only if $n$ is divisible by the product of the restricted periods of $f^{(i)}(x)$ and $f^{(j)}(x)$.

In the simple case when $f(x)$ is irreducible modulo $p$, we easily find as in §12 that $u_n \equiv 0 \pmod p$ only if $n \equiv 0 \pmod{\lambda : p^d - 1}$.* Here $d$ is some divisor of $k$ and $\lambda$ is the period of $f(x)$. In particular then if $k$ is a prime number, there is only one rank of apparition of $p$ in $(u)$, the restricted period of $(u)$.

It seems unprofitable to investigate the law of apparition in general 
Lucasian sequences in very much greater detail until it is definitely known 
whether or not Lucasian sequences exist which cannot be exhibited as di- 
visors of R-sequences. 
STOCHASTIC PROCESSES WITH AN INTEGRAL-VALUED PARAMETER*


BY 
J. L. DOOB 


The purpose of this paper is to set up the measure relations of the most 
general stochastic process and to discuss the properties of the conditional 
probability functions of the processes depending on a parameter running 
through integral values. In particular, the study of temporally homogeneous 
processes of this type is shown to be essentially the study of measure pre- 
serving transformations. The well known results in the latter field are applied 
to develop and extend the theory of Markoff processes from a new point of 
view. 

Throughout the paper, any non-negative completely additive function of point sets, defined on a Borel field of sets† of some abstract space $\Omega$ will be called a probability measure if the space $\Omega$ is itself in the field of definition and if the set function is defined as 1 on $\Omega$.

1. Probability measures defined on spaces of infinitely many dimensions. Let $\Omega_0$ ($X$) be any abstract space, consisting of elements $\omega_0$ ($x$), and let $F_{\omega_0}$ ($F_x$) be a Borel field of subsets of $\Omega_0$ ($X$). We shall suppose that $\Omega_0 \in F_{\omega_0}$, and $X \in F_x$. If $f(\omega_0)$ is a function defined on $\Omega_0$ which takes on values in $X$, and if the $\Omega_0$-set defined by the condition $f(\omega_0) \in E$ is in $F_{\omega_0}$ for every set $E$ of $F_x$, then the function $f(\omega_0)$ will be called measurable on $\Omega_0$. Let $\{f_n(\omega_0)\}$ be any sequence of such measurable functions, where the subscript ranges through any aggregate $\mathcal{A}$, not necessarily denumerable. If a probability measure $P_0(A_0)$ is defined on the sets $A_0$ of $F_{\omega_0}$, the measurable function $f_n(\omega_0)$, considered from the standpoint of probability, is a chance variable $x_n$. The following method of analyzing the mutual relations of such a set of chance variables has been used, more or less explicitly, in recent years.‡ Consider the space $\Omega$ whose points are the aggregates $\omega$: $\{x_n\}$, $n \in \mathcal{A}$, $x_n \in X$.§ If $n$ runs through all real numbers $t$, $\Omega$ consists of all functions of the real variable $t$,


* Presented to the Society, April 9, 1937; received by the editors July 26, 1937. 

† A field of sets is a collection of sets $E$ containing, with $E_1$ and $E_2$, their sum $E_1+E_2$ and difference $E_1-E_2$. A Borel field of sets is a field which contains with $E_1, E_2, \cdots$ their sum $\sum E_j$.

‡ Cf. Doob (I); Hopf (I); Khintchine (I); Kolmogoroff (II, pp. 24-30); Lévy (I and numerous papers); Lomnicki and Ulam (I); Paley and Wiener (I, chaps. 9 and 10, and earlier papers by Wiener). The Roman numerals refer to the bibliography at the end of the paper.

§ Each subscript $n$ determines a coordinate $x_n$, and the space $\Omega$ is thus a Cartesian space with a dimension corresponding to each element of $\mathcal{A}$.




with range in $X$; if $\mathcal{A}$ is the set of natural numbers, $\Omega$ is the space of all sequences $(x_1, x_2, \cdots)$, $x_n \in X$; if $\mathcal{A}$ is the set of all integers $\cdots, -1, 0, 1, \cdots$, $\Omega$ is the space of all sequences $(\cdots, x_0, x_1, \cdots)$, $x_n \in X$. The last example will be the one studied in later sections, but in the present section no restrictions will be made on $\mathcal{A}$. Let $\alpha_1, \cdots, \alpha_p$ be any finite set of subscripts, and let $E_1, \cdots, E_p$ be sets of $F_x$. The conditions

(1.1) $x_{\alpha_j} \in E_j, \quad j = 1, \cdots, p,$

determine a set of elements of $\Omega$. The class of all $\Omega$-sets defined in this way determines a Borel field $F_\omega$ of sets of $\Omega$.* Evidently $\Omega \in F_\omega$. We shall define a probability measure $P(A)$ on the sets $A$ of $F_\omega$ which will have as its value, on the set defined by (1.1), the $P_0$-measure of the $\Omega_0$-set determined by the conditions

(1.2) $f_{\alpha_j}(\omega_0) \in E_j, \quad j = 1, \cdots, p.$

The $P$-measure on $\Omega$ is defined by means of a mapping of $\Omega_0$ on $\Omega$, which takes the sets of $F_{\omega_0}$ into sets of $F_\omega$. Let $\omega_0$ be a point of $\Omega_0$. The map takes $\omega_0$ into the point $(\xi_n)$ of $\Omega$ defined by the equalities

(1.3) $\xi_n = f_n(\omega_0), \quad n \in \mathcal{A}.$

To the $\Omega$-set determined by the conditions of (1.1) will then correspond the $\Omega_0$-set determined by the conditions of (1.2). Then to any set $A$ of $F_\omega$ will correspond a set $A_0$ in the Borel field of sets determined by those sets which are defined by conditions of type (1.2). Since the latter sets are in $F_{\omega_0}$, $A_0 \in F_{\omega_0}$. We define $P(A)$ as $P_0(A_0)$. In this way the study of the mutual measure relations of the aggregate of functions $\{f_n(\omega_0)\}$ (the probability relations of the aggregate of chance variables $\{x_n\}$) is reduced to the study of the properties of the space $\Omega$. The earlier representation of the chance variable $x_m$ by means of the function $f_m(\omega_0)$ defined on $\Omega_0$ has been replaced by a new representation by means of the function $x_m(\omega)$, defined on $\Omega$ and taking on the value $\xi_m$ at the point $\omega$: $(\xi_n)$ for which the $m$th coordinate is $\xi_m$.†


* The (Borel) field determined by a given collection of point sets can be defined as the intersec- 
tion of all the (Borel) fields of sets which include the sets of the given collection. 

† The measure relations of $\Omega$ correspond to similar relations between the chance variables $\{x_n\}$; that is, between the original functions $f_n(\omega_0)$ in the sense that, if $A$ is an $\Omega$-set in the field $F_\omega$, the corresponding $\Omega_0$-set in the field $F_{\omega_0}$ is defined by conditions on the $f$'s which, when imposed on the $x$'s define $A$; and $P(A) = P_0(A_0)$. Due to the fact that the transformation from $\Omega_0$ to $\Omega$ is not one-to-one, certain relations of the functions $\{f_n(\omega_0)\}$ may become distorted; thus an $\Omega_0$-set in the field $F_{\omega_0}$ may not go into an $\Omega$-set in the field $F_\omega$. For example, if $\mathcal{A}$ is the set of real numbers $t$, so that $\Omega$ becomes the space of functions $x_t = x(t)$, and if $X$ is the space of real numbers, it may be that $f_t(\omega_0)$ is a continuous function of $t$ for all $\omega_0$; on the other hand the set of elements $x(t)$ of $\Omega$ which are continuous functions of $t$ will never be measurable (in terms of $P$-measure). Cf. Doob (II) for a detailed treatment of this case.
As an example of the advantages of this procedure, we give a discussion of the following classical theorem:

If $x_1, \cdots, x_N$ is a set of mutually independent chance variables, the expectation of their product is the product of their expectations.*

In order to treat this theorem we take $X$ as the space of real numbers, $F_x$ as the field of Borel sets of $X$ (or some more inclusive field), and $\mathcal{A}$ as the set of integers $1, \cdots, N$. The space $\Omega$ becomes ordinary $N$-dimensional cartesian space. A probability measure is defined, in terms of the measure properties of $x_j$ on the $x_j$-axis,† and the $P$-measure on $\Omega$ is determined in the usual (multiplicative) way from these separate measures.‡ The theorem in question is now an immediate consequence of Fubini's theorem on the equality of a multiple integral and the corresponding iterated integrals.§ Incidentally Fubini's theorem provides very sensitive sets of possible hypotheses: it is sufficient to suppose that the expectation of every $x_j$ exists, or else that the expectation of $x_1 \cdots x_N$ exists.
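
In the simplest setting, where each chance variable takes only finitely many values, the product construction and the resulting identity can be checked by direct computation. The following Python sketch assumes each variable is given as a dictionary from values to exact probabilities; the function names are illustrative only. The multiple sum over the product space plays the role of the multiple integral.

```python
from fractions import Fraction
from itertools import product

def expectation(dist):
    """Expectation of a finite distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def expectation_of_product(dists):
    """Sum x_1 * ... * x_N against the product measure on N-dimensional space."""
    total = Fraction(0)
    for point in product(*(d.items() for d in dists)):
        value, weight = Fraction(1), Fraction(1)
        for x, p in point:
            value *= x      # coordinate product x_1 ... x_N at this point
            weight *= p     # multiplicative P-measure of this point
        total += value * weight
    return total
```

With three such distributions, `expectation_of_product` agrees with the product of the three separate expectations, which is the finite case of the theorem.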

In the preceding discussion, the given aggregate of chance variables was considered in a given representation in terms of a corresponding aggregate of measurable functions $\{f_n(\omega_0)\}$, all defined on the same space $\Omega_0$; and a new representation was obtained in terms of the aggregate of functions $\{x_n(\omega)\}$ defined on $\Omega$. If the chance variables are not given in some such representation, the problem becomes more difficult. A family of chance variables is generally considered as a family of entities $\{x_n\}$, distinguished by a subscript $n$ (which is usually identified in some way with the time) varying in an aggregate $\mathcal{A}$. The chance variable $x_n$, which takes on values in a space $X$, is considered defined by a physical process in the course of its development.|| More specifically, it is supposed that there is a Borel field $F_x$ of sets of points


* If a numerically-valued chance variable x is represented by a measurable function f defined on 
a space on which some measure is defined, and if f is absolutely integrable, the expectation of x is 
defined as the integral of f. In treating this theorem we shall assume that the N chance variables are 
represented by N numerically-valued functions defined and measurable on a space on which some 
measure is defined. The fact that such a representation is always possible will appear later in this 
section. A recent proof of the theorem in question, with somewhat more stringent hypotheses than 
those to be given, and with the chance variables represented by Lebesgue measurable functions 
defined on the interval 0<x<1, was published by Kac (I, pp. 47-50). 

† The measure of the interval $a < x_j < b$ is defined as the probability that $a < x_j < b$.

‡ Cf. Saks, Théorie de l'Intégrale, Warsaw, 1933, pp. 257-263, for the details for $N=2$. The $P$-measure is determined by the fact that the $P$-measure of the $N$-dimensional interval $a_j < x_j < b_j$, $j = 1, \cdots, N$, is defined as the product of the (1-dimensional) measures of the sides $a_j < x_j < b_j$ previously defined.

§ Saks, ibid., p. 262.

|| Thus the chance variable $x_t$ may be the $x$-coordinate of the position of a particle (in a Brownian movement) at time $t$.
of $X$ such that $X$ is in $F_x$, and that if $\alpha_1, \cdots, \alpha_p$ are elements of $\mathcal{A}$, and if $E_1, \cdots, E_p$ are in $F_x$, a non-negative number is assigned to the $p$ conditions

(1.4) $x_{\alpha_j} \in E_j, \quad j = 1, \cdots, p.$

This number is called the probability that the conditions of (1.4) are satisfied. If $\alpha_1, \cdots, \alpha_p$ are kept fixed, these probability numbers assign measures to certain sets of the space of points $(x_{\alpha_1}, \cdots, x_{\alpha_p})$, the sets being those determined by conditions of the form

(1.4') $x_{\alpha_j} \in E_j, \quad j = 1, \cdots, p.$

It is supposed further that this ($p$-dimensional) measure function is additive for fixed subscripts $\alpha_1, \cdots, \alpha_p$, and that it can be defined on every set of the Borel field $F_{x_{\alpha_1}, \cdots, x_{\alpha_p}}$ (the field of $\Omega$-sets determined by the sets, defined by (1.4'), on which the function is already defined), in such a way that the extended set function is a ($p$-dimensional) probability measure. Now consider the space $\Omega$ and field $F_\omega$ as described above. An $\Omega$-set, determined by conditions imposed on a certain set of coordinates, will be called a cylinder set over those coordinates. It is readily shown that the field $F_{x_{\alpha_1}, \cdots, x_{\alpha_p}}$ is the field of cylinder sets of $F_\omega$ over $x_{\alpha_1}, \cdots, x_{\alpha_p}$. What was just described was therefore the determination of a probability measure on the cylinder sets of $F_\omega$ over $x_{\alpha_1}, \cdots, x_{\alpha_p}$. Moreover the various measures, obtained by varying the coordinates involved, are coherent in the sense that if $A$ is a cylinder set of $F_\omega$ over $x_{\alpha_1}, \cdots, x_{\alpha_p}$ and also a cylinder set over $x_{\beta_1}, \cdots, x_{\beta_q}$, then the probability measures, assigned to $A$ in the two representations, will be the same. To show that the present situation is no more general than that described above, when a probability measure was defined on all the sets of $F_\omega$, not merely over the cylinder sets of $F_\omega$ over a finite number of coordinates, it is necessary to prove the following theorem:


THEOREM 1.1. A set function, defined on every cylinder set of $F_\omega$ over a finite number of coordinates, which is a probability measure on the field of cylinder sets of $F_\omega$ over each such finite set of coordinates, can be so defined on the remaining sets of $F_\omega$ that it becomes a probability measure on this field.
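
The coherence required of the finite-dimensional measures in the hypothesis can be made concrete with a toy family. The Python sketch below assumes $X = \{0, 1\}$ and a hypothetical family of measures obtained by mixing two coin-tossing laws (the parameters in `THETAS` are arbitrary choices, not taken from the paper). It checks that each $p$-coordinate measure has total mass 1 and that summing out one coordinate reproduces the $(p-1)$-coordinate measure. It does not construct the extension itself, whose existence is the content of the theorem.

```python
from fractions import Fraction
from itertools import product

THETAS = [Fraction(1, 4), Fraction(3, 4)]  # hypothetical mixing parameters

def finite_dim_measure(values):
    """Probability assigned to the conditions x_{a_1}=values[0], ..., x_{a_p}=values[p-1]."""
    total = Fraction(0)
    for theta in THETAS:
        term = Fraction(1)
        for v in values:
            term *= theta if v == 1 else 1 - theta
        total += term
    return total / len(THETAS)

def is_coherent(p):
    """Check total mass 1 on p coordinates and agreement with the (p-1)-coordinate measure."""
    if sum(finite_dim_measure(v) for v in product((0, 1), repeat=p)) != 1:
        return False
    return all(
        finite_dim_measure(v + (0,)) + finite_dim_measure(v + (1,)) == finite_dim_measure(v)
        for v in product((0, 1), repeat=p - 1)
    )
```

Because this particular family does not depend on which coordinates are chosen, checking the last coordinate is enough here; a general family would need the check for every coordinate.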


This theorem was proved by Kolmogoroff (I, pp. 27-30) under the additional hypothesis that $X$ is the set of real numbers, and that $F_x$ is the field of Borel sets of $X$. Daniell (I) discussed measures on $\Omega$ in the case where $\mathcal{A}$ is the set of natural numbers (with $X$, $F_x$ defined as in Kolmogoroff's case) so that $\Omega$ is the space of sequences $\omega$: $(x_1, x_2, \cdots)$. This latter case, for which in addition the chance variables concerned are independent, has been discussed by many other writers who map $\Omega$ on a linear interval and define the measure of an $\Omega$-set by means of the Lebesgue measure of the corresponding set on the interval.*

Let $F$ be the collection of $\Omega$-sets each of which is determined by conditions of the form (1.4') or is a finite sum of such sets. Then $F$ is a field, and the given set function $P(A)$ is defined on the sets of $F$. Let $A_0, A_1, \cdots$ be sets of $F$. Then if $A_1, A_2, \cdots$ are disjunct, and if, in addition,

(1.5) $A_0 = \sum_{n=1}^{\infty} A_n,$

we shall show that

(1.6) $P(A_0) = \sum_{n=1}^{\infty} P(A_n).$

We prove (1.6) by reducing it to the corresponding result in the special case considered by Kolmogoroff. Fix a value of $n$, $n=\nu$, and consider the sets $E_1^\nu, E_2^\nu, \cdots$ of $X$ which are involved in the restrictions on $x_\nu$ used to define $A_0, A_1, \cdots$. We shall define a numerically-valued function $f_\nu(x)$, with domain $X$, so that each set $E_j^\nu$ is determined by simple inequalities imposed on $f_\nu(x)$. In order to define the function $f_\nu(x)$ we shall need the lemma which follows:


Lemma 1.1. Let $\mathcal{E}_1, \mathcal{E}_2, \cdots$ be any point sets of an abstract space. There is a collection of sets $\{E_r\}$ (where $r$ is rational and $0 < r < 1$) with the following properties:

(i) each $\mathcal{E}_j$ is in the field determined by the sets $E_r$, and conversely;
(ii) if $r_1 < r_2$, $E_{r_1} \subset E_{r_2}$;

(iii) if fm—r, where - 

(iv) if rm—0 (rm—1), then =0 


This lemma can be proved by a modification of the proof of a similar re- 
sult due to von Neumann.§ We do not suppose that there are necessarily 
infinitely many distinct sets E;. In the contrary case, there will be only a 
finite number of distinct sets E,’. 

Using this lemma, we define f,(x) as follows: Identify the sets {E, } 
(v fixed) with the sets {€;} of the lemma, and set 


(1.7) f.(«) = lim sup r, 


* The details of such a map can be found in Paley and Wiener (I, pp. 145-146). Lomnicki and Ulam (I) treat the case for which $\mathcal{A}$ is the set of natural numbers, and the chance variables concerned are independent (with no restriction on $X$); but their proof of the theorem stated above (Theorem 1.1) is defective. (The mistake is in the proof of Lemma 4, pp. 254-255.)

† Cf. Kolmogoroff (II, pp. 25-26).

‡ Except for a denumerable set of superscripts $\nu$, there will be only one set $E_j^\nu$; namely $X$ itself.

§ Annals of Mathematics, vol. 33 (1932), p. 602.
The set E,’ is characterized by the inequality
(1.8) fAx) & 


Since every set $\mathcal{E}_j$ is in the field determined by the sets $E_r$, every set $E_j^\nu$ is characterized by a finite number of inequalities imposed on $f_\nu(x)$.* Moreover, if $\tilde{E}$ is any Borel set of real numbers, the $x$-set $E$ determined by the condition $f_\nu(x) \in \tilde{E}$ is in the field $F_x$. The latter fact is apparent if $\tilde{E}$ is an interval, and its truth then follows for $\tilde{E}$ any Borel set.

Now consider the space $\tilde{\Omega}$ of points $\tilde{\omega}$: $\{\tilde{x}_n\}$, where $\tilde{x}_n$ is any real number.† Let $F_{\tilde{x}}$ be the field of Borel sets of the $\tilde{x}$-axis, and let $F_{\tilde{\omega}}$ be the Borel field of $\tilde{\Omega}$-sets determined by $F_{\tilde{x}}$ in the same way that $F_\omega$ is determined by $F_x$. We map $\Omega$ on $\tilde{\Omega}$, sending the point $(x_n)$ of $\Omega$ into the point $(\tilde{x}_n)$ of $\tilde{\Omega}$ for which

(1.9) $\tilde{x}_n = f_n(x_n), \quad n \in \mathcal{A}.$

This mapping is a single-valued transformation of $\Omega$ into some subset of $\tilde{\Omega}$. If $\tilde{A}$ is the $\tilde{\Omega}$-set determined by the conditions

(1.10) $\tilde{x}_{\alpha_j} \in \tilde{E}_j, \quad j = 1, \cdots, p,$

where $\tilde{E}_1, \cdots, \tilde{E}_p$ are Borel sets, the corresponding $\Omega$-set $A$ is determined by the conditions

(1.11) $f_{\alpha_j}(x_{\alpha_j}) \in \tilde{E}_j, \quad j = 1, \cdots, p.$

It then follows from the definitions of $F_\omega$ and $F_{\tilde{\omega}}$ that the $\Omega$-sets going into cylinder sets of $F_{\tilde{\omega}}$ over $\tilde{x}_{\alpha_1}, \cdots, \tilde{x}_{\alpha_p}$ are cylinder sets of $F_\omega$ over $x_{\alpha_1}, \cdots, x_{\alpha_p}$; and that the $\Omega$-sets going into the sets of $F_{\tilde{\omega}}$ are sets of $F_\omega$. We now define a set function $\tilde{P}(\tilde{A})$, on the sets $\tilde{A}$ of $F_{\tilde{\omega}}$ which are cylinder sets over a finite number of coordinates, by setting $\tilde{P}(\tilde{A}) = P(A)$, where $A$ is the $\Omega$-set containing every element $\omega$ which is taken by the transformation into an element $\tilde{\omega}$ of $\tilde{A}$. The set function $\tilde{P}(\tilde{A})$ is uniquely defined and is a probability measure on the field of cylinder sets over any fixed finite set of coordinates. According to the definition of $\tilde{\Omega}$, there are sets $\tilde{A}_0, \tilde{A}_1, \cdots$ to which correspond the sets $A_0, A_1, \cdots$ of (1.5). The sets $\tilde{A}_1, \tilde{A}_2, \cdots$ are disjunct, and

(1.5') $\tilde{A}_0 = \sum_{n=1}^{\infty} \tilde{A}_n.$

Now we are assuming Kolmogoroff's result that Theorem 1.1 is true if $X$ is the space of real numbers, and if $F_x$ is the field of Borel sets. Then $\tilde{P}(\tilde{A})$ must certainly be completely additive on its domain of definition, so that

(1.6') $\tilde{P}(\tilde{A}_0) = \sum_{n=1}^{\infty} \tilde{P}(\tilde{A}_n),$

and this is equivalent to (1.6), since $P(A_n) = \tilde{P}(\tilde{A}_n)$, $n \ge 0$.

* One can readily determine these inequalities explicitly, using the fact that any set in the field determined by the sets $\{E_r\}$ can be written in the form $E_{r_0} + (E_{r_2} - E_{r_1}) + \cdots + (E_{r_N} - E_{r_{N-1}})$, with $r_0 \le r_1 \le r_2 \le \cdots \le r_N$, or else in the same form without the first set $E_{r_0}$.

† The space $\tilde{\Omega}$ is the space of numerically-valued functions with domain $\mathcal{A}$.

The proof, that the domain of definition of P(A) can be extended as de- 
scribed in Theorem 1.1, is now immediate. By hypothesis, P(A) is an additive 
function of sets on the field $F$,* and the result just proved shows that $P(A)$ is completely additive on this field. It is then possible, according to a well known extension theorem,† to extend the definition of $P(A)$ to all the sets of the Borel field determined by the sets of $F$ (in this case the Borel field will be $F_\omega$), in such a way that $P(A)$ becomes a completely additive function of
sets on the larger field. The set function thus obtained is the probability 
measure described in Theorem 1.1. 

2. Definition of a stochastic process. We can now state the definition of a stochastic process (of the type to be studied in this paper) suggested by the preceding section. Let $X$ be any abstract space of elements $x$, and let $\Omega$ be the space whose elements $\omega$ are sequences $(\cdots, x_{-1}, x_0, x_1, \cdots)$ of points of $X$. Let $F_x$ be a Borel field of sets of points of $X$, including $X$ itself, and suppose that $E_1, \cdots, E_p$ are sets in $F_x$. The conditions

(2.1) $x_{\alpha_j} \in E_j, \quad j = 1, \cdots, p$

($\alpha_1, \cdots, \alpha_p$ any $p$ distinct integers) determine a cylinder set over $x_{\alpha_1}, \cdots, x_{\alpha_p}$. The class of all such cylinder sets determines the Borel field $F_\omega$ of $\Omega$-sets.

Let $P(A)$ be a probability measure defined on the sets of $F_\omega$. For a fixed set of coordinates $x_{\alpha_1}, \cdots, x_{\alpha_p}$, $P(A)$ becomes a probability measure defined essentially on the $p$-dimensional space of elements $(x_{\alpha_1}, \cdots, x_{\alpha_p})$, and the converse (Theorem 1.1) is also true. A stochastic process depending on the parameter $n$ running through integral values is the combination of the space $\Omega$ together with a probability measure defined on the sets of a field $F_\omega$. More precisely, the process is the changing real entity of which the above is the mathematical abstraction. Examples of stochastic processes are given in §5.

The function $x_j(\omega)$ taking on the value $x_j$ at the point $\omega$: $(\cdots, x_j, \cdots)$ is a measurable function on $\Omega$‡ taking on values in $X$; and the sequence of functions $\cdots, x_{-1}(\omega), x_0(\omega), x_1(\omega), \cdots$ can then be considered as a representation of a sequence of chance variables $\cdots, x_{-1}, x_0, x_1, \cdots$. Conversely,

* The field F was defined at the beginning of the proof. 


† Cf. H. Hahn, Annali delle Universita Toscane, Pisa, (2), vol. 2 (1933), pp. 433-436.
‡ Cf. §1.

we have seen in §1 that any such sequence of chance variables can be represented in this way.* It is sometimes useful to consider the sequence of chance variables $x_1, x_2, \cdots$. To do this we need only restrict our attention to the cylinder sets of $F_\omega$ over $x_1, x_2, \cdots$.

We shall suppose throughout this paper that the probability measure $P(A)$ is so extended that it is defined on every set $A_1$, differing from a set $A_0$ of $F_\omega$ by at most a subset of a set on which $P(A)$ vanishes, if we set $P(A_1) = P(A_0)$. The sets of this extended domain of definition will be called $P$-measurable. The subsets of a set of $P$-measure 0 are measurable and of $P$-measure 0. The $P$-measure of a $P$-measurable set $A$ is the greatest lower bound of the $P$-measures of sets $M \supset A$ which are finite or denumerably infinite sums of sets of the field $F$ defined above.† It follows from this fact that if $A$ is $P$-measurable, to every positive number $\epsilon$ corresponds a set $A_\epsilon$ which is a cylinder set of $F_\omega$ over a finite number of coordinates, and which has the property that $P(A \cdot CA_\epsilon) + P(CA \cdot A_\epsilon) < \epsilon$.‡ If $A$ is a $P$-measurable cylinder set over $x_{\alpha_1}, \cdots, x_{\alpha_p}$, it is determined by a condition of the form $(x_{\alpha_1}, \cdots, x_{\alpha_p}) \in E$, where $E$ is a $p$-dimensional set of points $(x_{\alpha_1}, \cdots, x_{\alpha_p})$. The set $E$ will be called an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of $P$-measure $P(A)$.

Let $f(\omega)$ be a numerically-valued function of $\omega$. If for every number $k$ the
inequality $f(\omega) > k$ defines a set of $F_\omega$ (a P-measurable set), $f(\omega)$ will be called
measurable with respect to $F_\omega$ (P-measurable). If $f(\omega)$ is measurable with respect to $F_\omega$, then it is P-measurable; and conversely if $f(\omega)$ is P-measurable,
there is a function $f_1(\omega)$, measurable with respect to $F_\omega$ and equal to $f(\omega)$ except possibly on a set of P-measure 0.§ This can be deduced from the following
fact (which in turn follows readily from the approximation of P-measurable
sets by means of cylinder sets of $F_\omega$ over a finite number of coordinates) that
if $f(\omega)$ is any P-measurable function, to every positive number $\epsilon$ corresponds
a function $f_\epsilon(\omega)$, measurable with respect to $F_\omega$, depending on only a finite
number of coordinates, and having the property that $|f(\omega) - f_\epsilon(\omega)| < \epsilon$ except
perhaps on an $\Omega$-set of P-measure $< \epsilon$. Throughout the above if the given
function depends only on some given set of coordinates, the approximating
functions can be supposed to depend only on the same coordinates. If $f(\omega)$ is
measurable with respect to $F_\omega$, and if $\{\alpha_j\}$ is any set of integers, $f(\omega)$ becomes
a function of the coordinates $x_{\alpha_1}, x_{\alpha_2}, \cdots$ only, if the other coordinates are


* Cf., however, the note on p. 88 in accordance with which it may sometimes be necessary to
define a stochastic process using a subspace of $\Omega$ rather than $\Omega$ itself, as in Doob (II).

† The extension theorem used in the proof of Theorem 1.1 defines $P(A)$ in precisely this way.

‡ The complement of any set $A$ will be denoted by $CA$ throughout this paper.

§ The corresponding theorems for Borel and Lebesgue measurable functions are discussed by
de la Vallée Poussin in his book Intégrales de Lebesgue, Fonctions d'Ensemble, Classes de Baire, 2d
edition, Paris, 1934, pp. 34-40.

held fast, and this function of $x_{\alpha_1}, x_{\alpha_2}, \cdots$ will be also measurable with respect to $F_\omega$.

In later sections we shall use the fact that if $F_x$ is the Borel field determined by some denumerable collection of its sets, the same will be true of $F_\omega$.
Even without this hypothesis, it can be shown (by transfinite induction) that
any given set $A$, in the field $F_\omega$, is in the field $F_\omega'$ corresponding to some Borel
field $F_x' \subseteq F_x$ such that $F_x'$ is the Borel field determined by some denumerable
collection of its sets. It then follows that if $f(\omega)$ is a function measurable with
respect to $F_\omega$, a subfield $F_x'$ of $F_x$ can be found which satisfies the denumerability condition just described and is such that $f(\omega)$ is measurable with respect
to the corresponding field $F_\omega'$.

The integral of the P-measurable function $f(\omega)$ over a P-measurable set $A$
will be denoted by

$$\int_A f\,dP;$$

and if the domain of integration is not stated explicitly, it will be understood
to be $\Omega$. If $f$ depends only on a finite number of coordinates $x_{\alpha_1}, \cdots, x_{\alpha_p}$,
and if $A$ is a cylinder set over those coordinates, we shall use the notation

$$\int_A f(x_{\alpha_1}, \cdots, x_{\alpha_p})\,P(dE_{\alpha_1, \cdots, \alpha_p})$$

for the integral of $f(\omega)$ over $A$. Corresponding notation will be used for integration with respect to other probability measures.

Let $T\omega$ be the transformation taking $\omega: (\cdots, x_{-1}, x_0, x_1, \cdots)$ into $\omega'$:
$(\cdots, x_{-1}', x_0', x_1', \cdots)$, that is, the transformation defined by

$$x_j' = x_{j-1}, \qquad j = 0, \pm 1, \pm 2, \cdots.$$

If $T\omega$ is measure preserving, the process is called temporally homogeneous.

3. The conditional probability functions.* Let $A$ be a P-measurable set.
The conditional probability function $P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ is defined
as follows. If the set $M$ is allowed to range through the P-measurable cylinder
sets over $x_{\alpha_1}, \cdots, x_{\alpha_p}$, $P(AM)$ is a non-negative completely additive function of sets $M$ which vanishes whenever $P(M) = 0$. There is therefore a function $P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$, a non-negative P-measurable function
depending only on the coordinates $x_{\alpha_1}, \cdots, x_{\alpha_p}$, satisfying

* The ideas in this and the following section are not new, but there seems to be no systematic
presentation of many of them in the literature, and some of the theorems have not been previously
stated explicitly. (Known theorems and definitions are stated for later reference.) The importance
and usefulness of the conditional probability and conditional expectation functions have been
stressed most by P. Lévy.

$$\int_M P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)\,dP = P(AM) \tag{3.1}$$

for every set $M$.* The function $P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ is determined
uniquely up to an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0. For a given set $A$,
$P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0; A)$ is called the conditional probability of $A$ if
$x_{\alpha_j} = x_{\alpha_j}^0$, $j = 1, \cdots, p$. The subscripts determining the function are given by
the subscripts in the argument, so there will be no danger of confusion if
$P_{\alpha_1, \cdots, \alpha_p}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ is written simply as $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$. We shall
need the following properties of $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ which are easily derived
from the definition:†

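(i) If $P(A) = 1$, then $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A) = 1$, except possibly on an
$(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0. If $P(A) = 0$, then $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A) = 0$,
except possibly on an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0.

(ii) If $A_1, A_2, \cdots$ are disjunct P-measurable sets, and if

$$\sum_{n=1}^{\infty} A_n = A,$$

then

$$\sum_{n=1}^{\infty} P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_n) = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A),$$

except possibly for an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0.

This implies the following fact:
(iii) If $A'$, $A''$ are P-measurable, and if $A' \subseteq A''$, then

$$P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A') \le P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A'')$$

except possibly for an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0.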


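By taking complements in (ii) we obtain the following property:
(iv) If $A_1, A_2, \cdots$ are P-measurable sets, and if

$$A_1 \supseteq A_2 \supseteq \cdots, \qquad \prod_{n=1}^{\infty} A_n = A,$$

then

$$\lim_{m \to \infty} P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_m) = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A),$$

except possibly for an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0.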
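
On a finite sample space the defining relation (3.1) and the additivity property (ii) can be checked directly. The following Python sketch is only an illustration under stated assumptions: the joint law `weights`, the sets `A`, `A1`, `A2`, `M` and all helper names are hypothetical choices, not taken from the text, and the conditional probability is computed as a ratio, which is legitimate here only because every atom has positive measure.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical joint law of two coordinates (x_1, x_2), each taking values 0, 1, 2.
weights = {(a, b): 1 + a + 2 * b for a, b in product(range(3), repeat=2)}
total = sum(weights.values())
P = {w: Fr(v, total) for w, v in weights.items()}

def measure(event):
    """P-measure of a set of sample points."""
    return sum(P[w] for w in event)

def cond_prob(a, event):
    """P(x_1 = a; event): conditional probability of `event` given x_1 = a."""
    slice_a = {w for w in P if w[0] == a}
    return measure(event & slice_a) / measure(slice_a)

A1 = {w for w in P if w[1] == 1}          # disjunct sets determined by x_2
A2 = {w for w in P if w[1] == 2}
A = A1 | A2
M_values = (0, 2)
M = {w for w in P if w[0] in M_values}    # a cylinder set over x_1

# Left side of (3.1): the conditional probability integrated over M.
lhs_3_1 = sum(cond_prob(a, A) * measure({w for w in P if w[0] == a})
              for a in M_values)
rhs_3_1 = measure(A & M)
```

Exact rational arithmetic is used so that the two sides of (3.1) can be compared with `==` rather than up to rounding.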


THEOREM 3.1. If $F_1$ is any Borel field of P-measurable sets, including the set
$\Omega$, determined by a denumerable subcollection $A_1, A_2, \cdots$, and if $\alpha_1, \cdots, \alpha_p$
are any given subscripts, then $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ can be so defined that there is
an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set $E$ of P-measure 0 such that for $(x_{\alpha_1}, \cdots, x_{\alpha_p})$ fixed, not
in $E$, $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ ($A \in F_1$) is a probability measure.


* This definition is due to Kolmogoroff (I, pp. 41-44). 
† Cf. Kolmogoroff (II, pp. 43-44).

If we identify the sets $A_1, A_2, \cdots$ with the sets $E_1, E_2, \cdots$ of Lemma 1.1,
we obtain sets $A_r'$ corresponding to the sets $E_r'$ of that lemma. We then map
$\Omega$ on the $t$-interval $0 \le t \le 1$ by the transformation which takes a point $\omega$ into
$t$ if

$$t = \underset{\omega \notin A_r'}{\text{L.U.B.}}\; r.$$

According to properties (i), (iii), (iv) of the conditional probability functions,
if $r, s$ are rational, with $r < s$, there is an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set $E_{rs}$ of P-measure 0
such that if $(x_{\alpha_1}, \cdots, x_{\alpha_p}) \notin E_{rs}$,

$$P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_r') \le P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_s'), \tag{3.2}$$

and there is an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set $E_r$ of P-measure 0 such that if $r'$ approaches
$r$ from above ($r$, $r'$ rational, $r \ge 0$), and if $(x_{\alpha_1}, \cdots, x_{\alpha_p}) \notin E_r$, then

$$\lim_{r' \to r} P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_{r'}') = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_r') \tag{3.3}$$


(where if r=0 the right side is replaced by 0). Let E’ be defined by 


and suppose that $(x_{\alpha_1}, \cdots, x_{\alpha_p})$ is fixed, not in $E'$. Then

$$F(r) = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_r')$$

is a monotone non-decreasing function of $r$, defined for rational values of $r$
between 0 and 1 and continuous on the right at these values. Define $F(t)$ for
every value of $t$ in the interval $0 \le t < 1$ as $\lim_{r \to t} F(r)$ ($r$ rational, $r > t$). This is
consistent with the previous definition if $t$ is rational, and $F(r)$ thus becomes a
monotone non-decreasing function $F(t)$ defined for $0 \le t \le 1$ and continuous on
the right. There is a non-negative completely additive function of Borel sets
on the $t$-interval $(0, 1)$ determined by the condition that its value on the
interval $0 \le t \le r$ is $F(r)$.* The Borel $t$-sets correspond to the sets of a certain
Borel field $F$ of $\Omega$-sets, under the transformation from $\omega$ to $t$, and a non-negative completely additive set function $\bar{P}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ is thereby defined on
these $\Omega$-sets. The Borel sets of the interval $(0, 1)$ are the sets of the Borel
field determined by intervals of the type $0 \le t \le r$, for $r$ rational, so that the
sets of $F$ are the sets of the Borel field determined by the images of such intervals. The image of the interval $0 \le t \le r$ ($r$ rational) is the set $A_r'$, so that
the field $F$ includes every set $A_r'$ and therefore every set $A_n$; $F \supseteq F_1$. By definition of $\bar{P}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$, if $r$ is rational,

* J. Radon, Sitzungsberichte der Akademie der Wissenschaften, Vienna, class Ila, vol. 122 
(1913), pp. 1305-1313. 


98 J. L. DOOB [July 


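$$\bar{P}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_r') = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A_r'). \tag{3.4}$$

We deduce from this, using the properties of the conditional probability functions listed above, that if $A$ is a set in the field $F_1$,

$$\bar{P}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A) = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A), \tag{3.5}$$

except perhaps on an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0; and to define
$P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ as required in the statement of the theorem, we need
only re-define $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ as $\bar{P}(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$.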

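If $\phi(x_{\alpha_1}, \cdots, x_{\alpha_p}, x_{\beta_1}, \cdots, x_{\beta_q}) = \phi((x_\alpha), (x_\beta))$ is a P-measurable and integrable function depending only on the indicated coordinates, we now define
the function

$$E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi) = \int \phi((x_\alpha), (x_\beta))\,P(x_{\alpha_1}, \cdots, x_{\alpha_p}; dE_{\beta_1, \cdots, \beta_q}) \tag{3.6}$$

not as an integral, but in such a way that the indicated integration actually
gives $E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)$ when it can be carried out. Let $M_\alpha$ be a P-measurable cylinder set over $x_{\alpha_1}, \cdots, x_{\alpha_p}$. Then $E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)$ is defined as
the P-measurable integrable function which satisfies

$$\int_{M_\alpha} E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)\,dP = \int_{M_\alpha} \phi\,dP \tag{3.7}$$

for all sets $M_\alpha$. The function $E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)$ is known as the conditional
expectation of $\phi$ for given $(x_{\alpha_1}, \cdots, x_{\alpha_p})$.* Changing $\phi$ on a set of P-measure
0 does not affect $E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)$. If $\phi$ is the characteristic function of a
P-measurable cylinder set $A$ over $x_{\beta_1}, \cdots, x_{\beta_q}$,

$$E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi) = P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A).$$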
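
The defining property of the conditional expectation, that its integral over every cylinder set $M_\alpha$ agrees with the integral of $\phi$, can be verified exhaustively when each coordinate takes finitely many values. The sketch below is a hypothetical example, not part of the original argument: the law `weights`, the function `phi`, and the helper names are assumptions made for illustration, with $p = q = 1$.

```python
from fractions import Fraction as Fr
from itertools import product, combinations

VALUES = (0, 1, 2)
# Hypothetical joint law of (x_alpha, x_beta); every point has positive mass.
weights = {(a, b): 1 + 2 * a + a * b for a, b in product(VALUES, repeat=2)}
Z = sum(weights.values())
P = {w: Fr(v, Z) for w, v in weights.items()}

def phi(w):
    """An arbitrary function of (x_alpha, x_beta), chosen only for illustration."""
    return w[0] - 4 * w[1] + w[0] * w[1] ** 2

def cond_exp(a):
    """E(x_alpha = a; phi): the weighted mean of phi over the slice x_alpha = a."""
    slice_a = [w for w in P if w[0] == a]
    return sum(phi(w) * P[w] for w in slice_a) / sum(P[w] for w in slice_a)

def check_defining_property(M_values):
    """Compare the two integrals over the cylinder set M = {x_alpha in M_values}."""
    lhs = sum(cond_exp(w[0]) * P[w] for w in P if w[0] in M_values)
    rhs = sum(phi(w) * P[w] for w in P if w[0] in M_values)
    return lhs == rhs

# Every cylinder set over x_alpha, including the empty set and the whole space.
all_cylinders = [set(c) for r in range(len(VALUES) + 1)
                 for c in combinations(VALUES, r)]
results = [check_defining_property(M) for M in all_cylinders]
```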


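THEOREM 3.2. Suppose that $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ can be defined so that
for each $(x_{\alpha_1}, \cdots, x_{\alpha_p})$ not in some $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of P-measure 0,
$P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ becomes a probability measure for $A$ in the field of cylinder
sets of $F_\omega$ over $(x_{\beta_1}, \cdots, x_{\beta_q})$. Then (3.6) can be interpreted as ordinary integration.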


This means, if $\phi((x_\alpha), (x_\beta))$ is P-measurable and integrable, that (a) whenever $(x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0)$ is not in some exceptional set of P-measure 0, $\phi((x_\alpha), (x_\beta))$
for $(x_{\alpha_1}, \cdots, x_{\alpha_p}) = (x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0)$ fixed is measurable in terms of the measure function $P(x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0; dE_{\beta_1, \cdots, \beta_q})$ (that is, that the cylinder set over
$(x_{\beta_1}, \cdots, x_{\beta_q})$ defined by $\phi > k$ with $(x_{\alpha_1}, \cdots, x_{\alpha_p}) = (x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0)$ is either
in $F_\omega$ or differs from such a set by a subset of such a set on which $P(x_{\alpha_1}^0, \cdots, x_{\alpha_p}^0; A)$
vanishes); and that (b) the integral (3.6) exists and is $E(x_{\alpha_1}, \cdots, x_{\alpha_p}; \phi)$, if we neglect sets of P-measure 0.

* This definition is due to Kolmogoroff (II, pp. 46-47).

We shall suppose that $p = q = 1$ to simplify the notation, and we can then
drop the subscript 1 from $\alpha$ and $\beta$. We shall suppose that $P(x_\alpha; A_\beta)$ is already
defined to satisfy the conditions of the theorem. In order to avoid confusion
we shall reserve the integral sign throughout this proof for actual integration.

(i) According to a theorem of Kolmogoroff (II, pp. 48-49), if $\phi(x_\alpha, x_\beta)$ is
P-measurable and integrable, then

$$E(x_\alpha; \phi) = \lim_{\lambda \to 0} \sum_{k=-\infty}^{+\infty} k\lambda\,P(x_\alpha; \lambda k \le \phi < \lambda(k+1)), \qquad \lambda > 0,{}^{*} \tag{3.8}$$

(if $\lambda$ approaches 0 taking on only a denumerable set of values), except perhaps for an $x_\alpha$-set of P-measure 0. The series in (3.8) converges absolutely
for each value of $\lambda$, except perhaps for an $x_\alpha$-set of P-measure 0. In particular,
suppose that $\phi(x_\alpha, x_\beta)$ is measurable with respect to $F_\omega$. Then for fixed $x_\alpha$,
$\phi(x_\alpha, x_\beta)$ becomes a function of $x_\beta$ which is measurable with respect to $F_\omega$
(cf. §2). The existence of the limit on the right, for a fixed value of $x_\alpha$, is exactly a condition that the integral

$$\int \phi(x_\alpha, x_\beta)\,P(x_\alpha; dE_\beta)$$

exist; and in fact the limit is equal to this integral. Theorem 3.2 thus follows
from Kolmogoroff's result for a function which is measurable with respect
to $F_\omega$.
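
The sum in (3.8) is a Lebesgue-type approximation: each value of $\phi$ is rounded down to the nearest multiple of $\lambda$, so for a fixed $x_\alpha$ the sum differs from the integral by less than $\lambda$. The Python sketch below illustrates this for one fixed $x_\alpha$ only; the conditional law `law` of $x_\beta$ and the values `phi` are hypothetical, and the names are assumptions made for the example.

```python
import math
from fractions import Fraction as Fr

# Hypothetical conditional law of x_beta for one fixed value of x_alpha,
# and hypothetical values of phi(x_alpha, x_beta) at that fixed x_alpha.
law = {0: Fr(1, 2), 1: Fr(1, 3), 2: Fr(1, 6)}
phi = {0: Fr(-7, 10), 1: Fr(9, 4), 2: Fr(5, 1)}

def lebesgue_sum(lam):
    """Sum over k of k*lam*P(lam*k <= phi < lam*(k+1)) for a step lam > 0."""
    total = Fr(0)
    for b, p in law.items():
        k = math.floor(phi[b] / lam)   # the unique k with lam*k <= phi < lam*(k+1)
        total += k * lam * p
    return total

# The integral of phi with respect to the conditional law.
exact = sum(phi[b] * p for b, p in law.items())

# Steps lam = 1/10, 1/100, ...: a denumerable set of values approaching 0.
steps = [Fr(1, 10 ** n) for n in range(1, 5)]
errors = [exact - lebesgue_sum(lam) for lam in steps]
```

Because each term is rounded down, every entry of `errors` is non-negative and smaller than the corresponding step.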

(ii) Let $\phi_0(x_\alpha, x_\beta)$ be the characteristic function of a set $A_0$ in the field $F_\omega$,
of P-measure 0, and let $A(\xi)$, ($\xi \in X$), be the cylinder set over $x_\beta$ determined by
the condition $\phi_0(\xi, x_\beta) = 1$. Then, neglecting $x_\alpha$-sets of P-measure 0 and using
(i), we obtain

$$E(x_\alpha; \phi_0) = \int \phi_0(x_\alpha, x_\beta)\,P(x_\alpha; dE_\beta) = 0,$$

that is, $P(x_\alpha; A(x_\alpha)) = 0$. This same result will be true even if $A_0$ is only supposed P-measurable, since it can then be enclosed in a set $A_0'$, in the field $F_\omega$,
which is of P-measure 0.† From this it follows that if $\phi(x_\alpha, x_\beta)$ is any function
which vanishes except perhaps on an $(x_\alpha, x_\beta)$-set of P-measure 0, then

$$E(x_\alpha; \phi) = \int \phi(x_\alpha, x_\beta)\,P(x_\alpha; dE_\beta) = 0,$$

if we neglect an $x_\alpha$-set of P-measure 0.

* The function $P(x_\alpha; \lambda k \le \phi < \lambda(k+1))$ is the conditional probability, for given $x_\alpha$, that
$\lambda k \le \phi < \lambda(k+1)$. Kolmogoroff's result does not require the hypotheses of Theorem 3.2.

† Cf. §2. The result is a generalization of the fact that if $E$ is a Lebesgue measurable set of measure 0, in two-dimensional $(x, y)$ space, the intersection of $E$ with the line $x = c$ will be of (one-dimensional) Lebesgue measure 0 for almost all values of $c$.
(iii) Let $\phi(x_\alpha, x_\beta)$ be any P-measurable and integrable function depending
only on $x_\alpha$, $x_\beta$. Then it can be expressed in the form

$$\phi(x_\alpha, x_\beta) = \phi_0(x_\alpha, x_\beta) + \phi_1(x_\alpha, x_\beta)$$

(cf. §2), where $\phi_0(x_\alpha, x_\beta)$ vanishes except perhaps on an $(x_\alpha, x_\beta)$-set of P-measure 0, and where $\phi_1(x_\alpha, x_\beta)$ is measurable with respect to $F_\omega$. Combining this
fact with the results of (i) and (ii) we see that (neglecting $x_\alpha$-sets of P-measure
0), for fixed $x_\alpha = x_\alpha^0$, $\phi(x_\alpha^0, x_\beta)$ is measurable in $x_\beta$ with respect to the measure
function $P(x_\alpha^0; dE_\beta)$, and that

$$\int \phi(x_\alpha^0, x_\beta)\,P(x_\alpha^0; dE_\beta) = \int \phi_1(x_\alpha^0, x_\beta)\,P(x_\alpha^0; dE_\beta) = E(x_\alpha^0; \phi_1) = E(x_\alpha^0; \phi),$$

as was to be proved. Conversely it is readily seen that if (3.6) exists as an
integral, except possibly for an $(x_\alpha)$-set of P-measure 0, and if the function of
$(x_{\alpha_1}, \cdots, x_{\alpha_p})$ thus obtained is integrable, then $\phi((x_\alpha), (x_\beta))$ is itself integrable,
and the original definition of conditional expectation is applicable.

According to Theorem 3.1, the hypotheses of Theorem 3.2 are always satisfied if $F_x$ (and therefore $F_\omega$) is the Borel field determined by a denumerable
collection of its sets. This will be true, for example, if $X$ is euclidean space of
$N \ge 1$ dimensions, and if $F_x$ is the field of Borel sets of $X$. A stronger statement
can be made, however, since if $\phi_1$ is measurable with respect to $F_\omega$, there is
always (cf. §2) a Borel field $F_x(\phi_1)$ of $X$-sets, (depending on $\phi_1$), which is the
Borel field determined by a denumerable collection of its sets, such that if
$F_\omega(\phi_1)$ is the Borel field of $\Omega$-sets defined in terms of $F_x(\phi_1)$, as $F_\omega$ is defined
in terms of $F_x$, then $\phi_1$ is measurable with respect to $F_\omega(\phi_1)$. Theorem 3.1 then
shows, since only sets of $F_x(\phi_1)$ (or $F_\omega(\phi_1)$) are involved, that (3.6) can always
be interpreted as integration,* if we define $P(x_\alpha; A_\beta)$ in a way depending on
the function $\phi$ under consideration.

The following theorem, proved by Kolmogoroff (II, pp. 47-48), is stated 
for future reference: 

THEOREM 3.3. If $\alpha$, $\beta$, $\gamma$ are distinct integers, and if $\phi(x_\alpha, x_\beta, x_\gamma)$ is P-measurable, then

$$E(x_\alpha; \phi) = E[x_\alpha; E(x_\alpha, x_\beta; \phi)]. \tag{3.9}$$


* The transition from the P-measurable function $\phi$ to the function $\phi_1$, measurable with respect to
$F_\omega$, is made as in (ii) above.

In particular, if $\phi$ is the characteristic function of a P-measurable cylinder set $A$
over $x_\alpha, x_\beta, x_\gamma$,

$$P(x_\alpha; A) = E[x_\alpha; P(x_\alpha, x_\beta; A)] = \int P(x_\alpha, x_\beta; A)\,P(x_\alpha; dE_\beta).{}^{*} \tag{3.10}$$
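
Relation (3.9) says that conditioning first on $(x_\alpha, x_\beta)$ and then on $x_\alpha$ alone gives the same result as conditioning on $x_\alpha$ directly. For a finite law this can be confirmed exactly. The following sketch is a hypothetical illustration: the law `w`, the function `phi`, and the helper `cond_exp` are assumptions introduced for the example, and coordinates are indexed 0, 1, 2 in place of $\alpha$, $\beta$, $\gamma$.

```python
from fractions import Fraction as Fr
from itertools import product

pts = list(product(range(2), repeat=3))   # points (x_alpha, x_beta, x_gamma)
# Hypothetical law with dependence between all three coordinates.
w = {p: 1 + p[0] + 2 * p[1] * p[2] + 3 * p[2] for p in pts}
Z = sum(w.values())
P = {p: Fr(v, Z) for p, v in w.items()}

def cond_exp(f, given):
    """Conditional expectation of f given the coordinates indexed by `given`.
    The result is a function of the full point that depends only on those coordinates."""
    def g(p):
        cell = [q for q in pts if all(q[i] == p[i] for i in given)]
        mass = sum(P[q] for q in cell)
        return sum(f(q) * P[q] for q in cell) / mass
    return g

def phi(p):
    """An arbitrary function of the three coordinates, for illustration only."""
    return p[0] + 10 * p[1] - 3 * p[2] * p[1] + p[2] ** 2

inner = cond_exp(phi, given=(0, 1))    # E(x_alpha, x_beta; phi)
outer = cond_exp(inner, given=(0,))    # E[x_alpha; E(x_alpha, x_beta; phi)]
direct = cond_exp(phi, given=(0,))     # E(x_alpha; phi)
```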


The following theorem will be useful: 
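THEOREM 3.4. Let $\phi(x_{\alpha_1}, \cdots, x_{\alpha_p})$ be a P-measurable integrable function.
Then

$$\int \phi\,dP = \int P(dE_{\alpha_1}) \int P(x_{\alpha_1}; dE_{\alpha_2}) \cdots \int \phi\,P(x_{\alpha_1}, \cdots, x_{\alpha_{p-1}}; dE_{\alpha_p}).{}^{\dagger} \tag{3.11}$$

This theorem can be considered as a corollary to the preceding one, but a
direct proof will be given by induction. If $p = 1$, (3.11) becomes

$$\int \phi\,dP = \int \phi\,P(dE_{\alpha_1}),$$

which is certainly true. Suppose that $q > 1$ and that the theorem is true
for $p < q$. In (3.11) (with $p = q$), the first (symbolic integration) gives
$E(x_{\alpha_1}, \cdots, x_{\alpha_{q-1}}; \phi)$. Since the theorem is supposed true for $p = q - 1$, the
right side of (3.11) then collapses to

$$\int E(x_{\alpha_1}, \cdots, x_{\alpha_{q-1}}; \phi)\,dP,$$

and this is equal to the left side of (3.11) by the definition of conditional expectation.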
Most stochastic processes which have been discussed in the mathematical 
literature are Markoff processes, that is, processes which satisfy the following 


* The last expression is only symbolic for the second one unless the conditional expectation concerned can be expressed as an integral. Theorem 3.3 is true and will be used below in a somewhat
more general form obtained by considering three groups of subscripts, an $\alpha$-group, a $\beta$-group, and a
$\gamma$-group, to replace $\alpha$, $\beta$, $\gamma$.

† This expression is to be evaluated from right to left. If the expression can be considered as an
iterated integral, the theorem becomes the generalization of Fubini's theorem (on the evaluation of a
multiple integral by means of iterated integrals) to the most general measures on product spaces on
which the field of measurable sets is defined by starting with the sets which are direct products of
measurable sets in the component spaces. In the case considered in Fubini's theorem, $P(x_{\alpha_1}, \cdots, x_{\alpha_j}; dE_{\alpha_{j+1}}) = P(dE_{\alpha_{j+1}})$, $j \ge 1$. Lévy (I, p. 73) obtains the generalization (where the product space is n-dimensional euclidean space) by a change of variable which reduces the result to Fubini's theorem.

condition: If $\alpha < \beta$, and if $A$ is a P-measurable cylinder set over $x_{\beta+1}, x_{\beta+2}, \cdots$,
then, except perhaps for an $(x_\alpha, x_{\alpha+1}, \cdots, x_\beta)$-set of P-measure 0,

$$P(x_\alpha, \cdots, x_\beta; A) = P(x_\beta; A).{}^{*}$$

It follows at once that if $\alpha_1 < \alpha_2 < \cdots < \alpha_p$, and if $A$ is a P-measurable cylinder set over $x_{\alpha_p+1}, x_{\alpha_p+2}, \cdots$, then if we neglect an $(x_{\alpha_1}, \cdots, x_{\alpha_p})$-set of
P-measure 0, $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A) = P(x_{\alpha_p}; A)$. If $\alpha < \beta$, and if $A$ is a cylinder
set over $x_\beta, x_{\beta+1}, \cdots$, (3.10) implies that

$$P(x_\alpha; A) = \int P(x_\beta; A)\,P(x_\alpha; dE_\beta) \tag{3.12}$$

for a Markoff process. Markoff processes are sometimes carelessly discussed
in the literature as if they were the general case, as if (3.12) followed from the
definition of probability.
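
That (3.12) is a genuine restriction, and not a consequence of the definition of probability, can be seen on two small examples. The sketch below is hypothetical: the transition matrix `T`, the two laws `markov` and `copy`, and the helper names are assumptions made for illustration, with $\alpha = 0$, $\beta = 1$ and $A$ a cylinder set over $x_2$. The law `markov` is built from transition probabilities, while in `copy` the coordinate $x_2$ repeats $x_0$ and $x_1$ is an independent fair coin.

```python
from fractions import Fraction as Fr
from itertools import product

def cond(P, given, event):
    """Conditional probability of `event` (a predicate on points) given the
    coordinate values in `given` (a dict index -> value), for a finite law P."""
    cell = [w for w in P if all(w[i] == v for i, v in given.items())]
    return sum(P[w] for w in cell if event(w)) / sum(P[w] for w in cell)

def rhs_3_12(P, a, event):
    """Right side of (3.12) with alpha = 0, beta = 1:
    the sum over b of P(x_1 = b; A) * P(x_0 = a; x_1 = b)."""
    return sum(cond(P, {1: b}, event) * cond(P, {0: a}, lambda w, b=b: w[1] == b)
               for b in (0, 1))

half = Fr(1, 2)
T = [[Fr(1, 4), Fr(3, 4)], [Fr(2, 3), Fr(1, 3)]]   # hypothetical transition matrix
# A Markoff law on (x_0, x_1, x_2) with a uniform law for x_0.
markov = {(a, b, c): half * T[a][b] * T[b][c]
          for a, b, c in product((0, 1), repeat=3)}
# A law that is not Markoff: x_1 is a fair coin independent of x_0, and x_2 copies x_0.
copy = {(a, b, a): half * half for a, b in product((0, 1), repeat=2)}

def A(w):
    """The cylinder set {x_2 = 1}."""
    return w[2] == 1

markov_sides = [(cond(markov, {0: a}, A), rhs_3_12(markov, a, A)) for a in (0, 1)]
copy_sides = [(cond(copy, {0: a}, A), rhs_3_12(copy, a, A)) for a in (0, 1)]
```

For `markov` the two sides agree for each value of $x_0$; for `copy` the left side is 0 or 1 while the right side is 1/2.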

4. Probability measures in terms of the conditional probability functions. 
In §3, the conditional probability functions were derived from the measure 
relations of a stochastic process. In this section the converse problem will 
be discussed. It will be seen that more is supposed below to be true of the 
conditional probabilities than is true in the general case, but the hypotheses 
are wide enough to cover the applications to be made. 

Let $F_x$, $F_\omega$ be fields as described in §2. Suppose that for every pair of integers $m, n$, with $m \le n$ and cylinder set $A_{n+1}$ in the field $F_\omega$ over $x_{n+1}$, a function $P(x_m, \cdots, x_n; A_{n+1})$ is defined and has the following properties:

(i) For fixed $x_m, \cdots, x_n$, $P(x_m, \cdots, x_n; A_{n+1})$ is a probability measure on
the field of sets $A_{n+1}$.

(ii) For fixed $A_{n+1}$, $P(x_m, \cdots, x_n; A_{n+1})$ is measurable with respect to $F_\omega$.

Let $Q(A)$ be a probability measure defined on the field of cylinder sets
of $F_\omega$ over $x_m$. There is then, as we shall now show, a uniquely determined
probability measure defined on the cylinder sets of $F_\omega$ over $x_m, x_{m+1}, \cdots$,
having the given functions $P(x_m, \cdots, x_n; A_{n+1})$ as its conditional probability
functions and equal to $Q(A)$ if $A$ is a cylinder set of $F_\omega$ over $x_m$. If $\phi_A$ is the characteristic function of any cylinder set $A$ of $F_\omega$ over $x_m, \cdots, x_n$, $P(A)$ is defined
by an iterated integral

$$P(A) = \int Q(dE_m) \int P(x_m; dE_{m+1}) \int P(x_m, x_{m+1}; dE_{m+2}) \cdots \int \phi_A\,P(x_m, \cdots, x_{n-1}; dE_n). \tag{4.1}$$


* A detailed discussion of the physical meaning of a Markoff process is given in Kolmogoroff (I). 
† Cf. P. Lévy (I, pp. 121-123).

To show that this defines $P(A)$ uniquely it is necessary to show that if $A$ is
also a cylinder set over $x_m, \cdots, x_{n'}$, ($n' \ne n$), the expression (4.1) and the corresponding expression

$$P'(A) = \int Q(dE_m) \int P(x_m; dE_{m+1}) \int P(x_m, x_{m+1}; dE_{m+2}) \cdots \int \phi_A\,P(x_m, \cdots, x_{n'-1}; dE_{n'})$$

are equal. We can suppose without restricting generality that $n' > n$. Then

$$P'(A) = \int Q(dE_m) \cdots \int P(x_m, \cdots, x_{n-1}; dE_n) \cdots \int \phi_A\,P(x_m, \cdots, x_{n'-1}; dE_{n'}),$$

and since $\phi_A$ can depend only on $x_m, \cdots, x_n$, the first integration gives
$P(x_m, \cdots, x_{n'-1}; \Omega)$ (which is identically 1) multiplied by $\phi_A$. Similarly the
next integrations, up to the integration over $x_n$, give $\phi_A$; hence $P'(A) = P(A)$.
Evidently $P(A)$, as thus defined, is a probability measure on the field of
cylinder sets of $F_\omega$ over any finite set of coordinates with subscripts at least
equal to $m$. It then follows from Theorem 1.1, as applied to the case where $\mathcal{A}$
is the set of integers $m, m+1, \cdots$, and where $X$, $F_x$ are as here given, that
the domain of definition of $P(A)$ can be extended to include all the cylinder
sets of $F_\omega$ over $x_m, x_{m+1}, \cdots$ in such a way that the extended set function is
a probability measure. Since if $A$, $M$ are respectively cylinder sets over $x_{n+1}$
and $x_m, \cdots, x_n$, with characteristic functions $\phi_A$, $\phi_M$,

$$\int_M P(x_m, \cdots, x_n; A)\,P(dE_{m, \cdots, n}) = \int \phi_M\,P(x_m, \cdots, x_n; A)\,P(dE_{m, \cdots, n}) = P(A \cdot M);$$

the conditional probability functions determined by P(A) are actually the 
given ones. The set function P(A) thus exists and it is uniquely determined 
by Q(B) and the given conditional probability functions, since (4.1) holds for 
P(A) either as a definition or as a theorem, because of Theorem 3.4. 
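
The consistency argument above, that (4.1) gives the same value whether $A$ is regarded as a cylinder set over $x_m, \cdots, x_n$ or over a longer stretch of coordinates, can be observed numerically when $X$ is finite. The Python sketch below is an illustration under assumptions: the two-point space `STATES`, the initial law `Q`, and the `kernel` (which deliberately depends on the whole past, so the resulting process need not be a Markoff process) are all hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

STATES = (0, 1)
Q = {0: Fr(1, 3), 1: Fr(2, 3)}            # hypothetical initial law of x_m

def kernel(past, nxt):
    """Hypothetical P(x_m, ..., x_n; {x_{n+1} = nxt}); it may depend on the whole past."""
    p_one = Fr(1 + sum(past), 2 + len(past))   # probability that the next value is 1
    return p_one if nxt == 1 else 1 - p_one

def path_prob(path):
    """Measure of the single path (x_m, ..., x_n): the iterated integral (4.1)
    evaluated on the characteristic function of one point."""
    prob = Q[path[0]]
    for i in range(1, len(path)):
        prob *= kernel(path[:i], path[i])
    return prob

def measure(event, n_coords):
    """P(A) for a cylinder set A given as a predicate, computed over n_coords coordinates."""
    return sum(path_prob(p) for p in product(STATES, repeat=n_coords) if event(p))

def A(p):
    """A cylinder set over the first two coordinates: x_m = 1 and x_{m+1} = 0."""
    return p[0] == 1 and p[1] == 0

values = [measure(A, n) for n in (2, 3, 5)]
```

The extra integrations over the later coordinates each contribute a factor that sums to 1, which is why the three entries of `values` coincide.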
What made this problem simple was the fact that the given conditional
probability functions were entirely independent of each other; that is, there
were no necessary relations between the given functions. This was possible
because only cylinder sets over $x_m, x_{m+1}, \cdots$ were being considered, for $m$
fixed. If $m$ is not to be kept fixed, the set of conditional probability functions
can no longer be chosen independently of each other. We shall only consider
the problem in detail for conditional probability functions corresponding
to Markoff processes. In the treatment just given, if we had supposed that
$P(x_m, \cdots, x_n; A_{n+1})$ depended only on $x_n$, the resulting process would have
been a Markoff process. To extend the results to the consideration of all the
sets of $F_\omega$ we shall need the following lemma:


LEMMA 4.1. Let $\{Q_N(E)\}$ be a sequence of probability measures defined on
the sets of some Borel field $S$ of sets of an abstract space. Suppose that

$$E_1 \supseteq E_2 \supseteq \cdots, \qquad \prod_{\nu=1}^{\infty} E_\nu = 0 \tag{4.2}$$

implies that

$$\lim_{\nu \to \infty} Q_N(E_\nu) = 0 \tag{4.3}$$

uniformly in $N$.
(i) If

$$\lim_{N \to \infty} Q_N(E) = Q(E) \tag{4.4}$$

exists for every $E$ in $S$, the set function $Q(E)$ is a probability measure.

(ii) If $S$ is the Borel field determined by a denumerable collection of its sets,
there is a subsequence $\{Q_{N_\nu}(E)\}$ of $\{Q_N(E)\}$ converging to a limiting probability
measure.

This lemma can be considered as a generalization of Helly’s theorem.* Its 
proof will only be sketched. 

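* Sitzungsberichte der Akademie der Wissenschaften, Vienna, class IIa, vol. 121 (1912), p. 286.

Proof of (i). We need only prove that

$$E = \sum_{\nu=1}^{\infty} E_\nu, \qquad E_\mu E_\nu = 0 \quad (\mu \ne \nu), \tag{4.5}$$

implies

$$Q(E) = \sum_{\nu=1}^{\infty} Q(E_\nu), \tag{4.6}$$

or (since $Q(E)$ is obviously additive) that

$$\lim_{\nu \to \infty} Q\Big(\sum_{n=\nu+1}^{\infty} E_n\Big) = 0. \tag{4.7}$$

Now

$$\sum_{n=1}^{\infty} E_n \supseteq \sum_{n=2}^{\infty} E_n \supseteq \cdots, \qquad \prod_{N=1}^{\infty} \sum_{n=N}^{\infty} E_n = 0,$$

so that the hypotheses of the lemma imply (4.7).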

Proof of (ii). Let $E_1', E_2', \cdots$ be a denumerable collection of sets of $S$,
such that the Borel system of sets determined by the sequence $\{E_n'\}$ is $S$.
By a familiar procedure, we can find a sequence of integers $\{N_\nu\}$ such that
$\lim_{\nu \to \infty} Q_{N_\nu}(E)$ exists for every set $E$ of $S$ which is in the field of sets determined by the sequence of sets $\{E_n'\}$. The hypotheses then imply that
$\lim_{\nu \to \infty} Q_{N_\nu}(E)$ exists for every set $E \in S$,* and the remainder of part (ii) then
follows from (i).

THEOREM 4.1. Suppose that $F_x$ is the Borel field of sets determined by a
denumerable subcollection of its sets, and that, for every pair of integers $m, n$,
with $m \le n$ and cylinder set $A$ in $F_\omega$ over $x_{n+1}$, a (conditional probability) function
$P(x_m, \cdots, x_n; A)$ is defined which has properties (i), (ii) given above, and for
which also

$$P(x_m, \cdots, x_n; A) = P(x_n; A). \tag{4.8}$$

Suppose that for each fixed value of $n$, whenever $A_1, A_2, \cdots$ is a sequence of
cylinder sets of $F_\omega$ over $x_{n+1}$ satisfying

$$A_1 \supseteq A_2 \supseteq \cdots, \qquad \prod_{\nu=1}^{\infty} A_\nu = 0, \tag{4.9}$$

it is true that

$$\lim_{\nu \to \infty} P(x_n; A_\nu) = 0 \tag{4.10}$$

uniformly in $x_n$. Then
(i) there is a Markoff process with the given conditional probability functions, and
(ii) if $x_n = x_{n+1}$ implies

$$P(x_n; A) = P(x_{n+1}; TA),{}^{\dagger} \tag{4.11}$$


* This can be proved by transfinite induction.
† The transformation $T$ was defined at the end of §2.

there is a Markoff temporally homogeneous process with the given conditional
probability functions.

Proof of (i). Let $Q(E)$ be any probability measure defined on the sets of
$F_x$. Then if $M$ is any cylinder set of $F_\omega$ over $x_{N+1}, \cdots, x_n$, with characteristic function $\phi_M$, define $P_N(M)$ by

$$P_N(M) = \int Q(dE_N) \int P(x_N; dE_{N+1}) \cdots \int \phi_M\,P(x_{n-1}; dE_n). \tag{4.12}$$

It was shown above that this determines $P_N(M)$ uniquely. In the following,
if $M$ is any cylinder set over a finite number of coordinates $x_m, x_{m+1}, \cdots, x_n$,
we define $P(x_{m-1}; M)$ by

$$P(x_{m-1}; M) = \int P(x_{m-1}; dE_m) \int P(x_m; dE_{m+1}) \cdots \int \phi_M\,P(x_{n-1}; dE_n), \tag{4.13}$$

where $\phi_M$ is the characteristic function of $M$.

Now let $m, n$ be any two integers with $m \le n$. The cylinder sets of $F_\omega$ over
$x_m, \cdots, x_n$ constitute a Borel field $F_{m,n}$ determined by a denumerable subcollection.* Suppose that $A_1, A_2, \cdots$ are sets in the field $F_{m,n}$, and that

$$A_1 \supseteq A_2 \supseteq \cdots, \qquad \prod_{\nu=1}^{\infty} A_\nu = 0.$$

Then

$$\lim_{\nu \to \infty} P(x_{m-1}; A_\nu) = 0, \tag{4.14}$$

for all $x_{m-1}$. If $\epsilon > 0$, and if $M_\nu$ is the cylinder set over $x_{m-1}$ on which there is
a value of $\mu \ge \nu$ such that $P(x_{m-1}; A_\mu) > \epsilon$, then it follows from (4.14) that

$$M_1 \supseteq M_2 \supseteq \cdots, \qquad \prod_{\nu=1}^{\infty} M_\nu = 0.$$

It follows from the definition of $M_\nu$ and from the fact that the conditional
probability functions are less than or equal to 1, that

$$P(x_{m-2}; A_\nu) = \int P(x_{m-1}; A_\nu)\,P(x_{m-2}; dE_{m-1}) \le \epsilon + P(x_{m-2}; M_\nu). \tag{4.15}$$


The hypotheses of the theorem imply that

$$\lim_{\nu \to \infty} P(x_{m-2}; M_\nu) = 0$$

uniformly in $x_{m-2}$.

* Cf. §2.

If $\nu_0$ is chosen so large that

$$P(x_{m-2}; M_\nu) < \epsilon, \qquad \nu \ge \nu_0,$$

for all $x_{m-2}$, it follows from (4.15) that

$$P(x_{m-2}; A_\nu) < 2\epsilon, \qquad \nu \ge \nu_0,$$

for all $x_{m-2}$. Then if $N < m - 2$, and if $\nu \ge \nu_0$,

$$P_N(A_\nu) = \int Q(dE_N) \int \cdots \int P(x_{m-2}; A_\nu)\,P(x_{m-3}; dE_{m-2}) < 2\epsilon;$$

so that if the field $F_{m,n}$ is identified with the field $S$ of Lemma 4.1, the
hypotheses of the lemma are satisfied. There is therefore a subsequence
$\{P_{N_\nu}(A)\}$ of $\{P_N(A)\}$ converging to a limiting probability measure (defined on the field $F_{m,n}$). Since $m, n$ are arbitrary except that $m \le n$, there
is a further subsequence $\{P_{N_\nu'}(A)\}$ such that

$$\lim_{\nu \to \infty} P_{N_\nu'}(A) = P(A)$$

exists for every cylinder set $A$ of $F_\omega$ over a finite number of coordinates. Since
$P(A)$ satisfies the hypotheses of Theorem 1.1, its domain of definition can
be extended to include all the sets of $F_\omega$. If $m, n$ are again any two integers
with $m \le n$, and if $A$ is a cylinder set of $F_\omega$ over $x_{n+1}$, it was shown in the general discussion preceding the statement of Theorem 4.1, that (if $N < m$),

$$\int_M P(x_n; A)\,dP_N = \int_M P(x_N, \cdots, x_n; A)\,dP_N = P_N(AM),$$

for every set $M$ of $F_{m,n}$; which expresses the fact that the conditional probabilities at the $N$th stage are the given ones. If $N$ becomes negatively infinite
only assuming values of the sequence $\{N_\nu'\}$, this becomes

$$\int_M P(x_n; A)\,dP = P(AM),$$

so that the conditional probability functions of the new P-measure are the
given ones.

Proof of (ii). Suppose that (4.11) is satisfied, and let $Q(E)$ be as in the
proof of (i). Let $A$ be a cylinder set of $F_\omega$ over a finite number of the coordinates $x_2, x_3, \cdots$. Consider the sequence of set functions $\{Q_N(A)\}$ where

$$Q_N(A) = \frac{1}{N} \sum_{m=1}^{N} \int P(x_1; T^m A)\,Q(dE_1).$$

A slight modification of the argument just used shows that some subsequence
$\{Q_{N_\nu}(A)\}$ of $\{Q_N(A)\}$ converges for every such set $A$. Moreover, if $P(A)$ is
the limit, then

$$P(TA) = \lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \int P(x_1; T^{m+1} A)\,Q(dE_1) = \lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=2}^{N_\nu + 1} \int P(x_1; T^m A)\,Q(dE_1)$$

$$= \lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \int P(x_1; T^m A)\,Q(dE_1) = P(A),$$

so that if $A$ is any cylinder set of $F_\omega$ over a finite number of coordinates, we
can consistently define $P(A)$ as $P(T^m A)$, where $m$ is chosen so large that $T^m A$
is a cylinder set over $x_2, x_3, \cdots$. The set function thus defined satisfies the
conditions of Theorem 1.1, hence it can be extended to become a probability
measure defined on all the sets of $F_\omega$. Evidently $P(A) = P(TA)$ for every set $A$
of $F_\omega$ and, as in the proof of part (i), the conditional probability functions
are the given ones. We have proved incidentally the following corollary:


COROLLARY. In part (ii) of Theorem 4.1, if $Q(E)$ is any probability measure
defined on the field $F_x$, there is an increasing sequence of positive integers
$N_1, N_2, \cdots$ such that

$$\lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \int P(x_1; T^m A)\,Q(dE_1) = P(A)$$

exists for all cylinder sets $A$ over a finite number of the coordinates $x_2, x_3, \cdots$ and
determines a possible choice of the probability measure $P(A)$.

The proof becomes particularly simple if a value $x_1^0$ of $x_1$ is chosen, and
if $Q(E)$ is defined as 1 or 0 according as $E$ does or does not contain $x_1^0$. The
integral then becomes $P(x_1^0; T^m A)$.

5. Examples. The examples discussed in this section are simple illustra- 
tive examples, all of Markoff processes, which will be studied in detail in §7. 

I. The type of stochastic process most frequently studied is that in which
the chance variables form an independent set, that is, in which if $E_1, \cdots, E_n$
are sets of $F_x$ and $\alpha_1, \cdots, \alpha_n$ are distinct integers, the P-measure of the
$\Omega$-set determined by the $n$ conditions $x_{\alpha_j} \in E_j$, ($j = 1, \cdots, n$), is the product
of the P-measures of the $n$ sets determined by the single conditions. This case
is characterized by the fact that $P(x_{\alpha_1}, \cdots, x_{\alpha_p}; A)$ does not depend on
$x_{\alpha_1}, \cdots, x_{\alpha_p}$ if $A$ is a P-measurable cylinder set over coordinates not including $x_{\alpha_1}, \cdots, x_{\alpha_p}$.*

II. Let $X$ contain $n$ elements, the numbers $1, \cdots, n$. We shall define a

* The corresponding P-measures on $\Omega$ have been examined by many writers, referred to in §1
and §2.

Markoff temporally homogeneous process. Let $(p_{jk})$ be an $n^2$ matrix of elements which satisfy the conditions

$$p_{jk} \ge 0, \qquad \sum_{k=1}^{n} p_{jk} = 1, \qquad j = 1, \cdots, n. \tag{5.1}$$

The element $p_{jk}$ is identified with the conditional probability that $x_{\nu+1} = k$ if
$x_\nu = j$, $\nu = 0, \pm 1, \cdots$. The P-measure will be completely determined if the
probability $p_k$ that $x_\nu = k$ (which is to be independent of $\nu$) is assigned. The
hypotheses of Theorem 4.1 (ii) are satisfied, so the existence of the "absolute
probabilities" $p_1, \cdots, p_n$ is assured. These satisfy (cf. equation (3.1))

$$\sum_{j=1}^{n} p_j p_{jk} = p_k, \qquad \sum_{j=1}^{n} p_j = 1, \qquad k = 1, \cdots, n. \tag{5.2}$$

According to the corollary to Theorem 4.1, the absolute probabilities can be
obtained in the form

$$p_k = \lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} p_{jk}^{(m)} \quad (j \text{ fixed}), \tag{5.3}$$

where the set $N_1, N_2, \cdots$ is an increasing set of positive integers, and $p_{jk}^{(m)}$ is
the conditional probability that $x_{\nu+m} = k$ if $x_\nu = j$. The element $p_{jk}^{(m)}$ is determined (cf. equation (3.12)) by

$$p_{jk}^{(m+1)} = \sum_{l=1}^{n} p_{jl}^{(m)} p_{lk}, \qquad p_{jk}^{(1)} = p_{jk}. \tag{5.4}$$

Evidently the matrix $(p_{jk}^{(m)})$ is the $m$th power of the matrix $(p_{jk})$, and its
elements satisfy (5.1).*
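
The role of the averaging in (5.3) is easiest to see on a periodic chain, where the powers $p_{jk}^{(m)}$ themselves have no limit. The Python sketch below is an illustration under assumptions: the matrices `flip` and `chain` and the helper names are hypothetical, and the full sequence $N = 1, 2, \cdots$ is used rather than a subsequence $N_\nu$. It builds the powers by the recursion (5.4), averages them as in (5.3), and the resulting row can be tested against the stationarity relation (5.2).

```python
from fractions import Fraction as Fr

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

def cesaro_average(p, N):
    """(1/N) * sum over m = 1..N of p^(m), where p^(m) is the m-th power of p."""
    n = len(p)
    power = p                                   # p^(1) = p
    total = [[Fr(0)] * n for _ in range(n)]
    for _ in range(N):
        total = [[total[i][k] + power[i][k] for k in range(n)] for i in range(n)]
        power = mat_mul(power, p)               # p^(m+1) = p^(m) p, as in (5.4)
    return [[entry / N for entry in row] for row in total]

# A periodic chain: the powers alternate between two matrices and have no limit,
# yet the averages settle down.
flip = [[Fr(0), Fr(1)], [Fr(1), Fr(0)]]
avg = cesaro_average(flip, 1000)
p_abs = avg[0]        # the row j = 0 of the averaged matrix, a candidate for (p_1, p_2)

# A second hypothetical chain, used to look at the averages for a finite N.
chain = [[Fr(1, 2), Fr(1, 2)], [Fr(1, 4), Fr(3, 4)]]
avg2 = cesaro_average(chain, 50)
shifted = mat_mul(avg2, chain)
```

For a finite $N$ the averaged matrix is only approximately stationary: multiplying it by the transition matrix changes each entry by at most $1/N$, since the difference is $(p^{(N+1)} - p^{(1)})/N$.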

III. Let $X$ be arbitrary, but suppose that $F_x$ is the Borel field of sets determined by some denumerable subcollection. A non-negative completely additive set function (not necessarily always finite-valued) is supposed defined
on $X$,† and the integral, with respect to this measure, of an X-measurable


* This classical Markoff process, the one originally studied by Markoff, is discussed by Hostinsky 
(II), who gives an extensive bibliography. References to more recent work will be given in §7. 
Fréchet has announced a new book on Markoff processes in which he will probably study this case 
in detail. 

t It is supposed that there is a monotone increasing sequence of sets, each of finite X-measure, 
whose sum is X, and that the X-measure of any X-measurable set is the limit of the X-measure of its 
intersection with the sets of the sequence. We shall suppose that this set function is extended as 
usual so that it is defined (and 0) on the subsets of sets of F, on which it vanishes. The sets for which 
the extended set function is defined will be called X-measurable, and X-measurable functions are then 
defined in the usual way. 
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function f(x) over an X-measurable set E will be denoted by /,f(x)dx.* 

Let X × Y be the product space of pairs (x, y), (x, y ∈ X). A measure can 
then be defined on X × Y by the condition that if E, F are X-sets in $F_x$, the 
X × Y-measure of the set determined by x ∈ E, y ∈ F is the product of the 
X-measures of E and F.† Let p(x, y) be a function defined on X × Y-space 
which is measurable with respect to the measure just defined,‡ and which 
satisfies the following conditions: 

(a) p(x, y) is non-negative; 

(b) p(x, y) is integrable in y for fixed x, and 


(5.5) $\int p(x, y)\,dy = 1.$§ 


If A is a cylinder set over $x_{n+1}$ determined by the condition $x_{n+1} \in E$ ($E \in F_x$), we 
define $P(x_n; A)$ by 

(5.6) $P(x_n; A) = \int_E p(x_n, y)\,dy, \quad n = 0, \pm 1, \dots.$

By Theorem 4.1 (ii) these conditional probability functions are those of a 
temporally homogeneous Markoff process if, whenever $E_1, E_2, \dots$ are sets in 
the field, 

(5.7) $E_1 \supset E_2 \supset \cdots, \quad \prod_{n=1}^{\infty} E_n = 0$

implies 

(5.8) $\lim_{n \to \infty} \int_{E_n} p(x, y)\,dy = 0$

uniformly in x. This can be interpreted as the uniform (in x) integrability of 
p(x, y) with respect to y, that is, the uniform (in x) absolute continuity (in E) 

* It will be supposed, as usual, that integrability means absolute integrability, and that a 
non-negative function is integrable if and only if its integral on the sequence of sets in the preceding note 
is bounded. In many applications, X is supposed to be a Borel set E of n-dimensional euclidean space 
and $F_x$ the field of Borel subsets of E; and the set function is supposed to be Borel measure. 

† Saks, Théorie de l’Intégrale, Warsaw, 1933, pp. 257-263. As usual we suppose that the measure 
is further extended so that subsets of sets of measure 0 are measurable and of measure 0. 

‡ We shall suppose further that p(x, y) is measurable with respect to the X × Y measure, as 
defined before its extension described in the preceding note, so that $p(x_0, x_1)$ is measurable with 
respect to $F_\omega$, considered as a function defined on Ω. In any case p(x, y) will be equal to such a function 
almost everywhere on X × Y-space. It then follows (cf. Saks, ibid., p. 258) that p(x, y) is X-measurable 
in x (y) for each fixed value of y (x). 

§ As in the previous sections, when no region of integration is explicitly prescribed integration 
will be over the whole space. 
of the set function $\int_E p(x, y)\,dy$. The condition will be satisfied if $p(x, y) \le \varphi(y)$ 
for all x, y, where $\varphi(x)$ is X-measurable and integrable over X. If the X-measure 
of X is finite, that is, if $\int 1\,dx < \infty$, the condition will be satisfied if $p(x, y)^2$ 
is integrable in y, and if there is a number K such that for every value of x, 

$\int p(x, y)^2\,dy \le K.$
If (5.7) implies (5.8), the measure function P(A) given by Theorem 4.1 becomes, 
on the cylinder sets over $x_1$, a function Q(E) of sets $E \in F_x$; and if A is 
determined by the condition $x_1 \in E$, Q(E) = P(A). Moreover (cf. equation (3.1)) 

(5.9) $\int Q(dx) \int_E p(x, y)\,dy = Q(E).$


If the X-measure of E vanishes, (5.9) shows that Q(E) = 0, that is, Q(E) is 
absolutely continuous. There is then an X-measurable function p(x) for which 

$Q(E) = \int_E p(x)\,dx,$

for all sets $E \in F_x$.* This function p(x) satisfies the equation 

(5.9′) $\int p(x)\,dx \int_E p(x, y)\,dy = \int_E p(y)\,dy,$

so that, if the order of integration is interchanged (p(x, y), p(x) are 
non-negative), 

$\int_E dy \int p(x)\,p(x, y)\,dx = \int_E p(y)\,dy.$

Then 

(5.10) $\int_E \left[ \int p(x)\,p(x, y)\,dx - p(y) \right] dy = 0.$

Since E is arbitrary, (5.10) implies 

(5.11) $\int p(x)\,p(x, y)\,dx = p(y)$

for almost all y. The function p(y) can now be changed on a set of X-measure 


0 to make (5.11) true for all y. According to the corollary to Theorem 4.1, 
the “absolute probability density” p(y) can be obtained in the form 


* Saks, Théorie de l’Intégrale, Warsaw, 1933, p. 257. 
(5.12) $\int_E p(y)\,dy = \lim_{\nu \to \infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \int_E p^{(m)}(x, y)\,dy$ (x fixed), 

where $N_1, N_2, \dots$ is an increasing sequence of positive integers, and 
$\int_E p^{(m)}(x, y)\,dy$ is the conditional probability that $x_{n+m} \in E$ if $x_n = x$. The 
function $p^{(m)}(x, y)$ is determined (cf. equation (3.12)) by 

(5.13) $p^{(1)}(x, y) = p(x, y), \quad p^{(m+1)}(x, y) = \int p^{(m)}(x, z)\,p(z, y)\,dz.$

Evidently the function $p^{(m)}(x, y)$ satisfies the conditions (a), (b) imposed on 
p(x, y). Moreover if the sequence of sets $E_1, E_2, \dots$ satisfies (5.7), and if (5.7) 
implies (5.8), then 

$\int_{E_n} p^{(m+1)}(x, y)\,dy = \int_{E_n} dy \int p^{(m)}(x, z)\,p(z, y)\,dz = \int p^{(m)}(x, z)\,dz \int_{E_n} p(z, y)\,dy \to 0,$

uniformly in x, so that $p^{(m)}(x, y)$ satisfies the condition of uniform 
integrability if p(x, y) does. A slight modification of the proof shows that if, for some 
integer $\mu \ge 1$, $p^{(\mu)}(x, y)$ satisfies the condition of uniform integrability, the 
function $p^{(m)}(x, y)$ for $m > \mu$ will also satisfy the condition; and then a suitable 
modification of the proof of Theorem 4.1 (ii) will show that it is sufficient to 
assure the existence of an absolute probability density (given, for example, by 
(5.12)) to suppose that for some m, $p^{(m)}(x, y)$ satisfies the uniform integrability 
condition. 
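
A small numerical sketch may help fix ideas about (5.11) and (5.13). It assumes, purely for illustration, X = [0, 1] with ordinary length as X-measure and the hypothetical kernel p(x, y) = 1 + (2x − 1)(2y − 1)/2, which is non-negative, integrates to 1 in y, and is bounded by 3/2 (so the domination condition p(x, y) ≤ φ(y) holds with a constant φ). Integrals are replaced by midpoint sums; the function names are not from the text.

```python
def kernel(x, y):
    """Hypothetical transition density p(x, y) on [0, 1] x [0, 1]."""
    return 1.0 + 0.5 * (2.0 * x - 1.0) * (2.0 * y - 1.0)


def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def iterate_kernel(n_grid, m):
    """Approximate p^(m)(x, y) on a midpoint grid using the recursion (5.13)."""
    pts = midpoints(n_grid)
    base = [[kernel(x, y) for y in pts] for x in pts]
    current = [row[:] for row in base]            # p^(1) = p
    for _ in range(m - 1):
        # p^(m+1)(x, y) = integral of p^(m)(x, z) p(z, y) dz, as a midpoint sum
        current = [[sum(current[i][l] * base[l][k] for l in range(n_grid)) / n_grid
                    for k in range(n_grid)] for i in range(n_grid)]
    return pts, current


PTS, P4 = iterate_kernel(40, 4)
# Largest deviation of p^(4)(x, y) from the constant density p(y) = 1.
MAX_DEV = max(abs(v - 1.0) for row in P4 for v in row)
```

Under these assumptions the constant function p(y) = 1 satisfies (5.11), and the iterates $p^{(m)}(x, y)$ approach it uniformly in x, so the averages in (5.12) converge to the same density.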

The Markoff process considered here is very general.* Example II is a 
special case. To show this we need only define X-measure suitably and define 
the function p(x, y) in terms of $p_{jk}$. The space X has points denoted by the 
numbers $1, \dots, n$. If E is any set of X containing r elements, define the 
X-measure of E as r. The function p(x, y) is defined as $p_{jk}$ for x = j, y = k. 
More generally we can consider a space X whose points are the numbers 
$1, 2, \dots$. The field $F_x$ is to be the field of all subsets of X, and the X-measure 
of a set is the number of points in it. A matrix $(p_{jk})$ is given whose elements 
satisfy 

(5.14) $p_{jk} \ge 0, \quad \sum_{k=1}^{\infty} p_{jk} = 1.$


* Hostinsky discusses this type in (I) and (II) and gives an extensive bibliography in (II). Cf. 
also the forthcoming book by Fréchet. Further references will be given in §7. 
The condition of uniform integrability becomes here the condition of uniform 
convergence in (5.14): 

(5.15) $\lim_{N \to \infty} \sum_{k=1}^{N} p_{jk} = 1$

uniformly in j. If the condition is satisfied, absolute probabilities $p_1, p_2, \dots$ 
exist satisfying the conditions 

(5.16) $p_j \ge 0, \quad \sum_{j=1}^{\infty} p_j = 1, \quad \sum_{j=1}^{\infty} p_j p_{jk} = p_k, \quad j, k \ge 1.$

Let $q_0, q_1, \dots$ be a sequence of non-negative numbers whose sum is 1. Suppose 
that $p_{jk} = 0$ if k < j, and $p_{jk} = q_{k-j}$ if k ≥ j. If $q_0 < 1$, it is readily seen that no 

process exists, temporally homogeneous or not, having the given conditional 

probabilities.* A particular case in which this is obvious is obtained by setting 
$q_1 = 1$. 
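
The mechanism behind this non-existence can be sketched numerically for the temporally homogeneous case. Since every step moves the state by a non-negative amount drawn from $q_0, q_1, \dots$, the probability of remaining inside any fixed finite set after m steps tends to 0 when $q_0 < 1$, so no absolute probabilities as in (5.16) can carry positive mass anywhere. The code below is an illustration under that reading; the function names and the sample sequences are assumptions, and it does not address the non-homogeneous part of the statement.

```python
def convolve(a, b, cap):
    """Convolve two distributions on {0, 1, ...}, keeping only indices 0..cap."""
    out = [0.0] * (cap + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= cap:
                out[i + j] += ai * bj
    return out


def mass_within(q, steps, cap):
    """Probability that the total displacement after `steps` steps is at most `cap`.

    Dropping mass beyond `cap` is harmless here because displacements never
    decrease, so that mass can never return below `cap`.
    """
    dist = [1.0] + [0.0] * cap      # after zero steps the displacement is 0
    for _ in range(steps):
        dist = convolve(dist, q, cap)
    return sum(dist)


# Hypothetical examples: a pure shift (all mass on a displacement of 1) and a lazy walk.
SHIFT = [0.0, 1.0]
LAZY = [0.5, 0.5]
```

With `SHIFT` the mass inside {0, …, 5} drops to exactly 0 after six steps; with `LAZY` it decays to 0 as the number of steps grows.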

IV. The following example is again that of a temporally homogeneous 
Markoff process. The space X is arbitrary, but we suppose that a probability 
measure is defined on the field $F_x$, and that a transformation Sx is defined 
on X which is one-to-one, takes X-measurable sets into X-measurable sets, 
and is X-measure preserving. If A is a cylinder set of $F_\omega$ over $x_{n+1}$, determined 
by the condition $x_{n+1} \in E$, define $P(x_n; A)$ by 

$P(x_n; A) = 1$ if $Sx_n \in E$, 
$P(x_n; A) = 0$ if $Sx_n \notin E$. 

The condition of Theorem 4.1 is not satisfied, but there is nevertheless a 
temporally homogeneous Markoff process with these conditional probability 
functions, for if M is an Ω-set determined by the conditions $x_{n_j} \in E_j$, 
$(j = 1, \dots, p)$, the P-measure of M can be defined as the X-measure of the 
set $\prod_{j=1}^{p} (S^{-n_j} E_j)$. 
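
To make the last definition concrete, here is a minimal sketch on a finite space. It assumes, hypothetically, six equally weighted points and the cyclic map S(x) = x + 1 (mod 6), and it handles only non-negative indices $n_j$ for brevity; the names are illustrative.

```python
from fractions import Fraction

X_POINTS = range(6)


def shift(x):
    """Hypothetical one-to-one, measure-preserving map S on six equally weighted points."""
    return (x + 1) % 6


def iterate(x, n):
    """Apply S to x exactly n times (n >= 0)."""
    for _ in range(n):
        x = shift(x)
    return x


def p_measure(conditions):
    """X-measure of {x : S^n x in E for every pair (n, E)}.

    This is the measure of the product (intersection) of the sets S^(-n) E,
    which example IV uses as the P-measure of the corresponding set M.
    """
    hits = [x for x in X_POINTS if all(iterate(x, n) in e for n, e in conditions)]
    return Fraction(len(hits), len(X_POINTS))
```

For instance, `p_measure([(2, {0, 1})])` equals the measure of {0, 1} itself, reflecting that S is measure-preserving, while `p_measure([(0, {0}), (1, {0})])` is 0 because the path starting at 0 is at 1 one step later.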

6. Temporally homogeneous processes. A stochastic process suggests the 
transformation idea in its very phraseology; for example, “the conditional 
probability that $x_n$ belong to a set E if $x_0 = \xi_0$”; and in §1 an explicit 
transformation T was defined to exploit this suggestion. In this section we shall 
consider only temporally homogeneous processes (for which T is measure-pre- 
serving). The theory of temporally homogeneous processes uses to a large ex- 
tent the terminology of the theory of measure-preserving transformations, 
and in this section we shall see that this has a complete justification, in every 

* This statement refers only to a process corresponding to a sequence of chance variables 
$\dots, x_{-1}, x_0, x_1, \dots$. The results of the preceding section show that the statement is not true if 
processes corresponding to a sequence of chance variables $x_1, x_2, \dots$ are being considered. 
detail, through the mediation of the transformation T.† The present section 
will then essentially be an independent study of the measure-preserving 
transformation T, with particular stress on the case where the P-measure satisfies 
the conditions imposed on the P-measure corresponding to a Markoff process. 
We shall apply the theory of measure-preserving transformations, as developed 
by Birkhoff, Koopman, and von Neumann. 

Suppose that a given process is temporally homogeneous. The ergodic 
theorem gives the following result.‡ 


THEOREM 6.1. Let the given process be temporally homogeneous, and let A 
be any P-measurable set. 
(i) If $\varphi(\omega)$ is P-measurable and integrable, there is a P-measurable function 
$\varphi^*(\omega)$ such that 

(6.1) $\lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} \varphi(T^m \omega) = \varphi^*(\omega)$

almost everywhere on Ω. In particular there is a P-measurable function $Q(\omega; A)$ 
such that 

(6.2) $\lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} P(x_m; T^m A) = Q(\omega; A)$

almost everywhere on Ω. 
(ii) If the process is a Markoff process, and if A is any P-measurable cylinder 
set over $x_\nu, x_{\nu+1}, \dots$, for some integer ν, 

(6.3) $\lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} P(x_0; T^m A) = P^*(x_0; A)$

exists almost everywhere on Ω; that is, except possibly on an $x_0$-set of P-measure 0. 


The fact that the limit exists in (6.2) is apparently new. Results closely 
related to (ii), with more restrictive hypotheses on the conditional probability 
functions,§ have been proved by Fréchet and (jointly) by Kryloff and 
Bogoliouboff. 
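
The averages in (6.1) can be tried out on the simplest kind of measure-preserving transformation. The sketch below assumes, hypothetically, that the underlying transformation is rotation of the unit interval by the golden-ratio angle (a process of the type of example IV of §5) and that φ is the indicator of [0, 1/2); neither choice comes from the text.

```python
import math

# Hypothetical irrational rotation angle (golden-ratio conjugate).
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def rotate(x):
    """One application of the rotation x -> x + ALPHA (mod 1)."""
    return (x + ALPHA) % 1.0


def ergodic_average(phi, x0, n_terms):
    """(1/N) * sum_{m=1}^{N} phi(S^m x0), the average appearing in (6.1)."""
    total, x = 0.0, x0
    for _ in range(n_terms):
        x = rotate(x)
        total += phi(x)
    return total / n_terms


def indicator_lower_half(x):
    """Indicator function of the interval [0, 1/2)."""
    return 1.0 if x < 0.5 else 0.0
```

For this rotation, which is metrically transitive, the averages approach 1/2, the measure of [0, 1/2), in line with Corollary 1.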


† Conversely, as was seen in example IV, a measure-preserving transformation gives rise to a 
certain (Markoff) temporally homogeneous process, which is necessarily of a very special type. 

‡ The form of the ergodic theorem used here (due to Birkhoff) is the following: If Tω is a 
measure-preserving transformation of an abstract space Ω, then part (i) of the following theorem (Theorem 
6.1) holds. A simple proof was given by Khintchine, Mathematische Annalen, vol. 107 (1933), pp. 
485-488. The function $\varphi(x, N)/N$ of Khintchine’s proof corresponds to the average in (6.1). For a 
complete treatment of the ergodic and related theorems see E. Hopf, Ergodentheorie, Ergebnisse der 
Mathematik, vol. 5, no. 2, which appeared so late that detailed reference to it could not be made in 
this paper. 

§ The only restriction on the conditional probability functions made in Theorem 6.1 is that 
there should actually exist a corresponding temporally homogeneous process; that is, that there 
should exist “absolute probabilities.” Exact references will be given in §7. 
Proof of (i). The first part of (i) is a restatement of the ergodic theorem. 
The second part of (i) is an application of the first part, with $\varphi(\omega) = P(x_0; A)$. 

Proof of (ii). Suppose that A is as described in (ii). Then if m > −ν, it 
follows from equation (3.12) that 

$\int P(x_m; T^m A)\,P(x_0; d\omega) = \int P(x_0, x_m; T^m A)\,P(x_0; d\omega) = P(x_0; T^m A),$

neglecting sets of P-measure 0, so that 

$\int \lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} P(x_m; T^m A)\,P(x_0; d\omega) = \lim_{N \to \infty} \frac{1}{N} \sum_{m=|\nu|+1}^{N} \int P(x_m; T^m A)\,P(x_0; d\omega)$
$\qquad = \lim_{N \to \infty} \frac{1}{N} \sum_{m=|\nu|+1}^{N} P(x_0; T^m A) = \lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} P(x_0; T^m A),$†

neglecting $x_0$-sets of P-measure 0. 


COROLLARY 1. If the transformation T is metrically transitive,‡ then 

$\varphi^*(\omega) = \int \varphi(\omega)\,dP, \quad Q(\omega; A) = P^*(x_0; A) = P(A)$

almost everywhere on Ω. 


This corollary is merely a rephrasing, pertinent to the case being consid- 
ered, of the ergodic theorem for metrically transitive systems.§ 


COROLLARY 2. If $F_x$ is the Borel field determined by a denumerable collection 
of its sets, and if there are no angle variables,|| then 


† If the first two of the above expressions are considered as actual integrals, the admissibility 
of the transition from the first to the second follows from Lebesgue’s theorem on the admissibility 
of term by term integration of a uniformly bounded convergent sequence of measurable functions. 
However, even if the integrals are considered merely as symbols for conditional expectations, the 
proof of Lebesgue’s theorem can be extended to this case. 

‡ Metric transitivity means here that no P-measurable set of measure not 0 or 1 is invariant 
under T. If there is a P-measurable set A of measure not 0 or 1, which is invariant neglecting a set of 
P-measure 0, that is, if $TA = A + A_0 - A_0'$, where $P(A_0) = P(A_0') = 0$, then the P-measurable set 
$\sum_{n=-\infty}^{\infty} T^n A$ has measure $P(A) \ne 0, 1$ and is invariant under T, so there cannot be metric transitivity. 
Hence the content of the definition is not changed if invariance up to a set of P-measure 0 is 
substituted for actual invariance. 

§ Cf. Khintchine, ibid., p. 488. 

|| An angle variable is a complex-valued P-measurable function $\varphi(\omega)$ such that $|\varphi| > 0$ on an 
Ω-set of positive P-measure, and that 

$\varphi(T\omega) = c\,\varphi(\omega), \quad (|c| = 1,\ c \ne 1),$

almost everywhere on Ω. If the transformation is metrically transitive, the invariance of $|\varphi(\omega)|$ under 
T implies that $|\varphi|$ = const. almost everywhere on Ω. (Cf. B. O. Koopman, Proceedings of the National 
Academy of Sciences, vol. 17 (1935), pp. 315-318.) 
(6.4) $\lim_{m \to \infty} P(M\,T^m A)$

exists, when m is restricted to a certain increasing set of integers of measure 1,* 
independent of the sets A, M which can be any P-measurable sets. If there is also 
metric transitivity, the limit in (6.4) is P(A)·P(M). 

Conversely, if the limit in (6.4) exists, for all P-measurable sets A, M on 
some set of integers of measure 1, there are no angle variables; and if the limit is 
P(A)·P(M), there is metric transitivity. 

This theorem was proved by Koopman and von Neumann in the metrically 
transitive case, for a one-parameter family of transformations $\{T_t\}$, 
$-\infty < t < \infty$.† Their proof is applicable, with insignificant modifications, to 
the family, considered here, of transformations $T_n = T^n$.‡ 

LEMMA 6.1. Let $f(\omega)$ be any complex-valued P-measurable function, and let 
$m_1, m_2, \dots$ be an increasing sequence of positive integers. Suppose that 
$\{f(T^{m_j}\omega)\}$ and $\{f(T^{-m_j}\omega)\}$ are sequences of functions convergent almost 
everywhere on Ω. Then if O is any open set of the complex plane, the Ω-sets defined 
by the conditions 

(6.5) $\lim_{j \to \infty} f(T^{m_j}\omega) \in O, \quad \lim_{j \to \infty} f(T^{-m_j}\omega) \in O$

are respectively cylinder sets over $x_1, x_2, \dots$ and $x_{-1}, x_{-2}, \dots$ (if we neglect 
sets of P-measure 0). 

To any positive integer ν corresponds (cf. §2) a P-measurable function 
$f_\nu(\omega)$ depending on only a finite number of coordinates, and having the 
property that 

(6.6) $|f(\omega) - f_\nu(\omega)| < 1/\nu$

except perhaps on an Ω-set of P-measure at most $2^{-\nu}$. There is a subsequence 


* A set of integers $a_1, a_2, \dots$, $(a_1 < a_2 < \cdots)$, is said to have measure 1 if 

$\lim_{m \to \infty} \frac{1}{m} \sum_{a_j < m} 1 = 1.$

† Proceedings of the National Academy of Sciences, vol. 17 (1935), pp. 315-318. To extend 
their proof to the non-metrically transitive case, it is only necessary to allow a wider interpretation 
of their projection operator $E_0$. The hypotheses of topological character they impose on their space 
are unnecessary in this application. 

‡ We use the fact that the P-measurable complex-valued functions whose absolute values squared 
are integrable form a unitary space H when distance and inner product are defined in the usual way 
(cf. Stone, Linear Transformations in Hilbert Space, American Mathematical Society Colloquium 
Publications, vol. 15, pp. 23-29). The hypothesis that $F_x$ is the Borel field determined by a 
denumerable collection of its sets means that H is separable and thus is either finite-dimensional or a Hilbert 
space. The theorem (and proof) of Koopman and von Neumann is valid in the finite case also. 
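$\{\mu_j\}$ of $\{m_j\}$ such that for each positive integer j, $f_j(T^{\mu_j}\omega)$ ($f_j(T^{-\mu_j}\omega)$) 
depends only on $x_1, x_2, \dots$ ($x_{-1}, x_{-2}, \dots$). Since T is measure-preserving, 

(6.7) $|f(T^{\mu_j}\omega) - f_j(T^{\mu_j}\omega)| < 1/j$
$\qquad (|f(T^{-\mu_j}\omega) - f_j(T^{-\mu_j}\omega)| < 1/j)$

(j fixed), except perhaps on an Ω-set of P-measure at most $2^{-j}$. Then 

(6.8) $|f(T^{\mu_\nu}\omega) - f_\nu(T^{\mu_\nu}\omega)| < 1/N, \quad \nu = N, N+1, \dots$
$\qquad (|f(T^{-\mu_\nu}\omega) - f_\nu(T^{-\mu_\nu}\omega)| < 1/N, \quad \nu = N, N+1, \dots)$

except perhaps on a set of P-measure at most $2^{-N+1}$; so that 

(6.9) $\lim_{\nu \to \infty} f_\nu(T^{\mu_\nu}\omega) = \lim_{\nu \to \infty} f(T^{\mu_\nu}\omega)$
$\qquad (\lim_{\nu \to \infty} f_\nu(T^{-\mu_\nu}\omega) = \lim_{\nu \to \infty} f(T^{-\mu_\nu}\omega))$

almost everywhere on Ω. Then the sets defined by (6.5) are the same as the 
sets defined by the conditions 

$\lim_{\nu \to \infty} f_\nu(T^{\mu_\nu}\omega) \in O, \quad \lim_{\nu \to \infty} f_\nu(T^{-\mu_\nu}\omega) \in O$

respectively, if we neglect sets of P-measure 0. This fact implies the truth of 
the lemma. 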
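
The bound on the exceptional set used in passing from (6.7) to (6.8) can be written out as follows. This is a sketch of the counting step only, in notation chosen here (the sets $B_\nu$ are not named in the text): the exceptional sets for the individual indices ν ≥ N are united and their measures summed as a geometric series.

```latex
% B_nu denotes the exceptional set of (6.7) for the index nu (notation assumed here).
\[
  B_\nu = \bigl\{\omega : |f(T^{\mu_\nu}\omega) - f_\nu(T^{\mu_\nu}\omega)| \ge 1/\nu \bigr\},
  \qquad P(B_\nu) \le 2^{-\nu},
\]
\[
  P\Bigl(\bigcup_{\nu \ge N} B_\nu\Bigr)
  \;\le\; \sum_{\nu = N}^{\infty} 2^{-\nu}
  \;=\; 2^{-N+1}.
\]
% Outside this union, 1/nu <= 1/N for every nu >= N, which gives (6.8);
% letting N grow then gives (6.9) almost everywhere.
```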


LEMMA 6.2. The equality 

$P(\dots, x_{n-1}, x_n; A) = P(x_n; A)$* 

holds almost everywhere on the space Ω of a Markoff process, for any P-measurable 
cylinder set A over $x_{n+1}, x_{n+2}, \dots$. 


We need only show that, if M is a P-measurable cylinder set over 
$\dots, x_{n-1}, x_n$, then 

(6.10) $\int_M P(x_n; A)\,dP = P(AM).$

It is evidently sufficient to prove (6.10) for sets M which are cylinder sets 
over a finite number of coordinates. If M is such a cylinder set, over 
$x_m, x_{m+1}, \dots, x_n$, (6.10) follows from the fact that $P(x_m, \dots, x_n; A) = P(x_n; A)$ 
almost everywhere on Ω. 


* The conditional probability function $P(\dots, x_{n-1}, x_n; A)$ is defined in the same way as the 
function $P(x_m, \dots, x_n; A)$. Note that the finiteness of the set $a_1, \dots, a_p$ was not used in the 
definition of the latter function. 
THEOREM 6.2. A temporally homogeneous Markoff process is metrically 
transitive* if and only if there is no set E of the field $F_x$ such that if A is the Ω-set 
determined by the condition $x_1 \in E$, (0 < P(A) < 1), and 

(i) A is invariant under T, if we neglect a set of P-measure 0, or 

(ii) if we neglect $x_0$-sets of P-measure 0, 

then 

(6.11) $P(x_0; A) = 1,\ x_0 \in E; \qquad P(x_0; A) = 0,\ x_0 \notin E.$



Proof of (i). If there is an invariant set A of the type described in the 
theorem, the process cannot be metrically transitive, by the definition of metric 
transitivity. Conversely, if the process is not metrically transitive, there is a 
P-measurable set M invariant under T, and 0 < P(M) < 1. If $f(\omega)$ is the 
characteristic function of M, $f(\omega) = f(T\omega)$ on Ω so that 

(6.12) $\lim_{n \to \infty} f(T^n\omega) = \lim_{n \to \infty} f(T^{-n}\omega) = f(\omega)$

on Ω. Then by Lemma 6.1, M can be considered either as a cylinder set over 
$x_{-1}, x_{-2}, \dots$ (when we denote it by $M_1$) or as a cylinder set over $x_1, x_2, \dots$ 
(when we denote it by $M_2$), neglecting sets of P-measure 0. It follows from 

(6.13) $0 \le P(\dots, x_{-1}, x_0; M_2) \le 1,$
$\qquad \int_{M_1} P(\dots, x_{-1}, x_0; M_2)\,dP = P(M_1 M_2) = P(M_1),$
$\qquad \int_{CM_1} P(\dots, x_{-1}, x_0; M_2)\,dP = P(CM_1 \cdot M_2) = 0$

that 

(6.14) $P(\dots, x_{-1}, x_0; M_2) = 1,\ \omega \in M_1; \qquad P(\dots, x_{-1}, x_0; M_2) = 0,\ \omega \notin M_1,$

if we neglect sets of P-measure 0. Now, according to Lemma 6.2, 
$P(\dots, x_{-1}, x_0; M_2) = P(x_0; M_2)$ almost everywhere on Ω. Then if a set of 
P-measure 0 is neglected, M must be a cylinder set over $x_0$; this set is 
determined by the condition $x_0 \in E$, where E is the $x_0$-set on which $P(x_0; M_2) = 1$. 
Since we can suppose (cf. §2), altering $P(x_0; M_2)$ on an $x_0$-set of P-measure 0 
if necessary, that $P(x_0; M_2)$ is measurable with respect to $F_\omega$, we can suppose 
that E is in $F_x$. The Ω-set determined by the condition $x_0 \in E$ is invariant (up 

* If the transformation T is metrically transitive, or has angle variables, the same will be said to 
be true of the corresponding stochastic process. 
to a set of P-measure 0); so that it is the same as any set determined by a 
condition $x_n \in E$, up to a set of P-measure 0. 

Proof of (ii). If there is a set E, as described in the theorem, for which 
the hypotheses of (ii) are true, then 

$P(T^{-1}A \cdot A) = \int_{T^{-1}A} P(x_0; A)\,dP = P(A),$

so that $T^{-1}A$, and therefore A, is invariant under T up to a set of P-measure 0; 
and the process cannot be metrically transitive. Conversely, if the process is 
not metrically transitive, an invariant set A of the type described in part (i) 
exists, and (6.14) becomes precisely (6.11). 
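
For a chain with finitely many states the criterion (6.11) can be checked by brute force. The sketch below searches for non-empty proper sets E of states with transition probability into E equal to 1 from inside E and 0 from outside. It assumes that the absolute probabilities give positive weight to both E and its complement, so that 0 < P(A) < 1; the matrices and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def splitting_sets(p):
    """Non-empty proper sets E with P(x0; x1 in E) = 1 on E and 0 off E, as in (6.11)."""
    n = len(p)
    found = []
    for size in range(1, n):
        for e in combinations(range(n), size):
            into_e = [sum(p[j][k] for k in e) for j in range(n)]
            if all(into_e[j] == (1 if j in e else 0) for j in range(n)):
                found.append(set(e))
    return found


HALF = Fraction(1, 2)
# Hypothetical chain made of two blocks that never communicate.
REDUCIBLE = [[HALF, HALF, 0, 0],
             [HALF, HALF, 0, 0],
             [0, 0, HALF, HALF],
             [0, 0, HALF, HALF]]
# Hypothetical chain whose successive states are independent.
INDEPENDENT = [[HALF, HALF], [HALF, HALF]]
```

Here `splitting_sets(REDUCIBLE)` returns the two blocks, so that chain is not metrically transitive under the stated assumption, while `splitting_sets(INDEPENDENT)` is empty.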


THEOREM 6.3. Suppose that $\varphi(\omega)$ is an angle variable of a temporally 
homogeneous Markoff process, so that 

(6.15) $\varphi(T\omega) = c\,\varphi(\omega), \quad |c| = 1,\ c \ne 1,$

almost everywhere on Ω. 
(i) The function $\varphi(\omega)$ can be considered as a function of $x_0$ alone, namely, 
$\varphi(\omega) = \psi(x_0)$, so that (6.15) becomes 

(6.15′) $\psi(x_1) = c\,\psi(x_0),$

and the possible exceptional set is an $(x_0, x_1)$-set of P-measure 0. 
(ii) If the hypotheses of Theorem 3.2 are satisfied, and if the conditional 
probability functions are supposed defined as described in the statement of that 
theorem, then for each value of $x_0$, 

(6.16) $\psi(x_1) = \text{const.} = c\,\psi(x_0)$

on a cylinder set $A(x_0)$ over $x_1$ such that $P(x_0; A(x_0)) = 1$ except possibly on an 
$x_0$-set of P-measure 0. 

(iii) If $\psi(x_0)$ takes on any non-zero value on a set of positive P-measure, c is 
a root of unity. 

(iv) There exist P-measurable cylinder sets $A_0$, $A_1$ over $x_0$, $x_1$ respectively, 
determined by the conditions $x_0 \in E_0$, $x_1 \in E_1$, such that $0 < P(A_j) < 1$, and if we 
neglect $x_0$-sets of P-measure 0, 

(6.17) $P(x_0; A_1) = 1,\ x_0 \in E_0; \qquad P(x_0; A_1) = 0,\ x_0 \notin E_0.$

(v) The function $\psi(x_0)$, if it is integrable, satisfies the integral equation 

(6.18) $\int \psi(x_1)\,P(x_0; dx_1) = c\,\psi(x_0),$

except possibly for an $x_0$-set of P-measure 0.* 
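
A finite example may make parts (iii) and (v) tangible. For a chain with transition matrix $(p_{jk})$ the left side of (6.18) is the sum of $p_{jk}\psi(k)$ over k. The sketch below assumes a hypothetical chain that cycles deterministically through three states and checks that ψ = (1, c, c²) with c = exp(2πi/3) satisfies (6.18). Note that (6.18) is only a consequence of (6.15′); for a deterministic cycle the two happen to coincide. The function names are illustrative.

```python
import cmath


def conditional_expectation(p, psi):
    """Left side of (6.18) for a finite chain: sum over k of p[j][k] * psi[k]."""
    n = len(p)
    return [sum(p[j][k] * psi[k] for k in range(n)) for j in range(n)]


def satisfies_6_18(p, psi, c, tol=1e-12):
    """True if |c| = 1, c != 1, psi is not identically 0, and (6.18) holds for every state."""
    lhs = conditional_expectation(p, psi)
    return (abs(abs(c) - 1.0) < tol
            and abs(c - 1.0) > tol
            and any(abs(v) > tol for v in psi)
            and all(abs(lhs[j] - c * psi[j]) < tol for j in range(len(p))))


# Hypothetical chain: state 0 -> 1 -> 2 -> 0 with probability 1.
CYCLE3 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
C3 = cmath.exp(2j * cmath.pi / 3)      # a cube root of unity
PSI3 = [1, C3, C3 ** 2]
```

The constant `C3` is a cube root of unity, as part (iii) requires, and the same ψ fails (6.18) for a chain whose successive states are independent, consistent with Theorem 6.4.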
Proof of (i). Suppose that (6.15) is satisfied. There is an increasing 
sequence of positive integers $m_1, m_2, \dots$ such that 

$\lim_{j \to \infty} c^{m_j} = \lim_{j \to \infty} c^{-m_j} = 1.$

Then 

(6.19) $\lim_{j \to \infty} \varphi(T^{m_j}\omega) = \lim_{j \to \infty} \varphi(T^{-m_j}\omega) = \varphi(\omega),$

almost everywhere on Ω. If A(O) is the Ω-set determined by the condition 
$\varphi(\omega) \in O$ (O an open set of the complex φ-plane), the method used in the proof 
of the preceding theorem shows that A(O) is a cylinder set over $x_0$, neglecting 
an Ω-set of P-measure 0. It follows readily from this that there is a P-measurable 
function $\psi(x_0)$, depending only on $x_0$ and such that $\varphi(\omega) = \psi(x_0)$ almost 
everywhere on Ω. 

Proof of (ii). Let $A(x_0)$ be the cylinder set over $x_1$, determined by the 
condition $x_1 \in E(x_0)$, on which $\psi(x_1) = c\,\psi(x_0)$. The $(x_0, x_1)$-set M, determined by 
the condition that $x_1 \in E(x_0)$ for each value of $x_0$, is of P-measure 1, and its 
measure can be expressed, according to Theorem 3.4, as 

$P(M) = \int P(dx_0) \int f_M(x_0, x_1)\,P(x_0; dx_1),$

where $f_M$ is the characteristic function of M. Then $P(x_0; A(x_0)) = 1$ except 
possibly on an $x_0$-set of P-measure 0. 

Proof of (iii). If $\psi(x_0)$ takes on a value $\psi_0 \ne 0$ on a set $A_0$ of positive 
P-measure, $\psi(x_0)$ must take on $c^n\psi_0$ on a set $A_n$ of the same P-measure (since 
$\psi(x_n) = c^n\psi(x_0)$). The number c must then be a root of unity; for if not, the 
numbers $\psi_0, c\psi_0, \dots$ are all distinct, so that the sets $A_0, A_1, \dots$ are all 
disjunct. But this is impossible since then 

$1 \ge P\Bigl(\sum_{n=0}^{\infty} A_n\Bigr) = \sum_{n=0}^{\infty} P(A_n) = \infty.$

Evidently the fact that the process is a Markoff process was not needed in 
this proof of (iii). 

Proof of (iv). If O is an open set of the complex ψ-plane, so chosen that 
the P-measure of the Ω-set $A_0$, determined by the condition $\psi(x_0) \in O$, is 
positive and less than 1; and if $A_1$ is the Ω-set determined by the condition 
$c^{-1}\psi(x_1) \in O$, then $A_0 = A_1$ (if we neglect sets of P-measure 0). Equation (6.17) 
then follows at once. (Cf. the proof of Theorem 6.2 (ii).) 


* The left side of (6.18) is the conditional expectation $E(x_0; \psi)$, and the conditions under which 
it can be considered an integral were considered in §3. The integrability condition imposed on ψ 
is unimportant, since if ψ is an angle variable, the integrable function $\psi_K$, equal to ψ if $|\psi| \le K$ and 
otherwise equal to K, satisfies (6.15′) almost everywhere, so that $\psi_K$ is also an angle variable if K is 
chosen so large that $|\psi_K| > 0$ on a set of positive P-measure. 

Proof of (v). If $\psi(x_1) = c\,\psi(x_0)$ almost everywhere on Ω, and if ψ is 
integrable, then 

$\int \psi(x_1)\,P(x_0; dx_1) = \int c\,\psi(x_0)\,P(x_0; dx_1) = c\,\psi(x_0)$

except possibly on an $x_0$-set of P-measure 0, as was to be proved. 


THEOREM 6.4. A temporally homogeneous process for which the corresponding 
sequence of chance variables $\dots, x_{-1}, x_0, x_1, \dots$ form an independent set 
(cf. §5, example I) is metrically transitive and has no angle variables.* 


A process of this type is a very special case of a Markoff process, so 
Theorems 6.2 and 6.3 are applicable. A set A, as described in the statement of 
Theorem 6.2, is impossible, since in the case of independence $P(x_0; A) = P(A)$; 
and the process is therefore metrically transitive. For the same reason, there 
can be no sets $A_0$, $A_1$, as described in Theorem 6.3 (iv); and the process 
therefore has no angle variables. 


THEOREM 6.5. Suppose the measure relations of a temporally homogeneous 
Markoff process have the following property: There is a function $\varphi(x_0, x_1)$, 
measurable with respect to $F_\omega$ and integrable over Ω in $x_1$ for fixed $x_0$, such that (except 
possibly for an $x_0$-set of P-measure 0), 

(6.20) $P(x_0; A) = \int_A \varphi(x_0, x_1)\,P(dx_1)$

whenever A is a cylinder set of $F_\omega$ over $x_1$. 
(i) The process has no angle variables if and only if $\lim_{n \to \infty} P(x_0; T^n A)$ 
exists (except possibly on an $x_0$-set of P-measure 0) for every such set A. 

(ii) The process has no angle variables and is metrically transitive if and only 
if $\lim_{n \to \infty} P(x_0; T^n A) = P(A)$ (except possibly on an $x_0$-set of P-measure 0) for 
every such set A. 

(iii) If it is true that $P(x_0; A)$ (for A a cylinder set of $F_\omega$ over $x_1$) can be 
defined to be a probability measure, for each fixed value of $x_0$, which vanishes 
identically in $x_0$ for a given set A if it vanishes at all, then a function φ exists and 
satisfies the hypotheses of the theorem. 


Before proving the theorem, we shall give an example of a temporally 
homogeneous stochastic process which has no angle variables, and for which 


* This result was proved by Doob (I, pp. 761-763), and Hopf (I, p. 95). 

the limit described in (i) does not exist. This example will therefore show that 
some condition, such as the existence of the function φ as described, is 
necessary in the theorem. The example is a particular case of example IV of §5. 
In example IV suppose that the transformation S is metrically transitive and 
has no angle variable. Then it is readily seen that the transformation T is 
metrically transitive and has no angle variable. On the other hand, if A 
is a cylinder set of $F_\omega$ over $x_1$, determined by the condition $x_1 \in E$, then 
$P(x_0; T^m A)$ is 1 or 0 according as $S^{m+1}x_0$ is or is not in E. Then $\lim_{m \to \infty} P(x_0; T^m A)$ 
exists almost everywhere on Ω only if $S^m x_0$, for large m, is finally always in E 
or never in E for almost all $x_0$. This implies that E is invariant under S (up 
to a set of X-measure 0), which is impossible, since S is metrically transitive, 
if we choose A so that $P(A) \ne 0, 1$. 
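
By contrast, for a chain with finitely many states whose absolute probabilities $p_k$ are all positive, a density as in (6.20) always exists (one may take $\varphi(j, k) = p_{jk}/p_k$), so parts (i) and (ii) of the theorem can be observed directly on the m-step transition probabilities. The following sketch uses two hypothetical two-state matrices; the names are illustrative.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]


def n_step(p, n):
    """n-step transition matrix p^n for n >= 1; row j lists P(x0 = j; x_n = k)."""
    out = [row[:] for row in p]
    for _ in range(n - 1):
        out = mat_mul(out, p)
    return out


# Hypothetical chain with absolute probabilities (5/6, 1/6) and no angle variable.
APERIODIC = [[0.9, 0.1], [0.5, 0.5]]
# Hypothetical alternating chain; psi = (1, -1) is an angle variable with c = -1.
FLIP = [[0.0, 1.0], [1.0, 0.0]]
```

For `APERIODIC` every row of the n-step matrix approaches (5/6, 1/6), the situation of part (ii); for `FLIP` the n-step matrices alternate between two values and the limit in part (i) does not exist.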

Proof of (i). Suppose first that there are no angle variables and that $F_x$ is 
the Borel field determined by a denumerable collection of its sets. According 
to Theorem 6.1, Corollary 2, there is an increasing sequence of integers, 
independent of A, M, such that 

(6.21) $\lim_{\nu \to \infty} P(M\,T^{a_\nu}A) = \lim_{\nu \to \infty} \int_M P(x_1; T^{a_\nu}A)\,dP = Q(M; A)$

exists, where A, M are P-measurable cylinder sets over $x_1$. From this it follows 
readily that if $f(x_1)$ is P-measurable and integrable over Ω, then 

(6.22) $\lim_{\nu \to \infty} \int f(x_1)\,P(x_1; T^{a_\nu}A)\,dP = \int f(x_1)\,Q(dx_1; A).$* 

In particular (cf. equation (3.12)) 

(6.23) $\lim_{\nu \to \infty} \int P(x_1; T^{a_\nu}A)\,\varphi(x_0, x_1)\,P(dx_1) = \lim_{\nu \to \infty} \int P(x_1; T^{a_\nu}A)\,P(x_0; dx_1) = \lim_{\nu \to \infty} P(x_0; T^{a_\nu}A)$

exists. We shall denote this limit by $Q(x_0; A)$. Evidently 
K 
if K is a P-measurable cylinder set over the coordinates $x_m, x_{m+1}, \dots, x_0$ 
(m < 0). If ε > 0, there is an integer N = N(ε) so large that if $a_j > N$, 

$|P(x_0; T^{a_j}A) - Q(x_0; A)| \le \epsilon/6$


* The set function Q(M; A) is obviously additive in M for fixed A. Since Q(M; A) ≤ P(M), Q(M; A) 
for fixed A is a completely additive function of sets M; hence integration with respect to the 
differential element $Q(dx_1; A)$ has a meaning. 


except possibly on a set $A_\epsilon$ such that $P(A_\epsilon) < \epsilon/6$. If M is a P-measurable 
cylinder set over $x_1$, and if $a_j < \nu$, then 


(6.25) P(MT’A) = = J. 


P(%o; 
so that, if use is made of (6.24) and (6.25), 


| P(MT’A) — A) | 


IIA 


e/6 + 2P(A.) S 
if $\nu > a_j > N$. Then since 

$Q(T^k M; A) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} P(T^k M\,T^j A) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} P(M\,T^{j-k} A)$

is independent of k, 

(6.26) $\lim_{\nu \to \infty} P(M\,T^\nu A) = Q(M; A),$

so that (6.21) holds with $a_\nu = \nu$. The proof that (6.21) implies the existence 
of the limit in (6.23) can now be used to show that the existence of 
$\lim_{\nu \to \infty} P(M\,T^\nu A)$ implies the existence of $\lim_{\nu \to \infty} P(x_0; T^\nu A)$. The hypothesis 
that $F_x$ is the Borel field determined by a denumerable collection of its sets 
can now be removed; since if this is not true, we can preassign the set A, and 
then replace the field $F_x$ by a smaller field $F_x'$ for which the denumerability 
hypothesis is true, such that (cf. §2) the set A is in the corresponding field $F_\omega'$, 
and such that $\varphi(x_0, x_1)$ is measurable with respect to $F_\omega'$. 

Conversely suppose that $\lim_{\nu \to \infty} P(x_0; T^\nu A)$ exists for every set A, as 
described in the theorem. Then if M is a P-measurable cylinder set over $x_1$, 

(6.27) $\lim_{\nu \to \infty} P(M\,T^\nu A) = \lim_{\nu \to \infty} \int_M P(x_1; T^\nu A)\,dP$

exists. Thus 

(6.28) $\lim_{\nu \to \infty} \int f(x_1)\,\overline{g(x_{\nu+1})}\,dP$* 

exists, if $f(x_1)$, $g(x_1)$ are characteristic functions of cylinder sets of $F_\omega$ over $x_1$. 
The limit can then be shown to exist (using a familiar method of approximation) 
if $f(x_1)$, $g(x_1)$ are any bounded complex-valued P-measurable functions 


* If ξ is a complex number, the notation $\bar{\xi}$ will be used, as is customary, to denote its conjugate 
complex number. 

depending only on $x_1$. Now if there is an angle variable, there is, as was seen 
above, a bounded angle variable. If $\psi(x_1)$ is a bounded angle variable, we set 
f = g = ψ in (6.28) and find that the limit 

$\lim_{\nu \to \infty} \int \psi(x_1)\,\overline{\psi(x_{\nu+1})}\,dP = \lim_{\nu \to \infty} c^{-\nu} \int |\psi(x_1)|^2\,dP, \quad |c| = 1,\ c \ne 1,$

must exist. This is absurd; hence there can be no angle variable. 

Proof of (ii). If the process has no angle variables, and if it is also 
metrically transitive, $\lim_{n \to \infty} P(x_0; T^n A)$, which we know exists for A a P-measurable 
cylinder set over $x_1$ by part (i), must be P(A) since (Theorem 6.1, Corollary 1) 

$\lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{N} P(x_0; T^m A) = P(A),$

if we neglect $x_0$-sets of P-measure 0 throughout. Conversely if, whenever A is 
a P-measurable cylinder set over $x_1$, $\lim_{n \to \infty} P(x_0; T^n A) = P(A)$ except possibly 
on an $x_0$-set of P-measure 0, the process can have no angle variables, according 
to (i). If the process is not metrically transitive, there is a P-measurable 
cylinder set A over $x_1$, (0 < P(A) < 1), which is invariant under T (if we neglect 
a set of P-measure 0). Then 

$P(x_0; A) = \lim_{n \to \infty} P(x_0; T^n A) = P(A),$

that is, $P(x_0; A) = P(A)$ except possibly on an $x_0$-set of P-measure 0. This is 
incompatible with (6.11). The process therefore has no angle variables and is 
metrically transitive, as was to be proved. 

Proof of (iii). We shall use the hypotheses of (iii) only to derive the fact 
that if A is a cylinder set of $F_\omega$ over $x_1$, and if $P(x_0; A) = 0$ for some value of $x_0$, 
then P(A) = 0. This fact is obvious from the equation 

$P(A) = \int P(x_0; A)\,P(dx_0).$

Now consider the field of cylinder sets of $F_\omega$ over $x_0, x_1$. One probability 
measure is already defined on this field, namely P-measure. We define a second 
probability measure $\bar{P}(M)$ for M in this field by 

(6.29) $\bar{P}(M) = \int P(dx_0) \int f(x_0, x_1)\,P(dx_1),$

where $f(x_0, x_1)$ is the characteristic function of M.* According to Theorem 
3.4, 

* This new measure is essentially a measure in two-dimensional $(x_0, x_1)$-space, obtained in the 
usual (multiplicative) way (cf. Saks, Théorie de l’Intégrale, Warsaw, 1933, pp. 257-263) from a given 
measure (P-measure) on the $x_0$-axis and a given measure (P-measure) on the $x_1$-axis. 
(6.30) $P(M) = \int P(dx_0) \int f(x_0, x_1)\,P(x_0; dx_1)$

and the integration need not be taken symbolically. Let $M(x_0)$ be the cylinder 
set over $x_1$ defined by the equation $f(x_0, x_1) = 1$. If we integrate in (6.30), then 
if P(M) = 0, $P(x_0; M(x_0)) = 0$, except possibly on an $x_0$-set of P-measure 0. It 
has already been shown that for each value of $x_0 = \xi$ such that $P(\xi; M(\xi)) = 0$, 
$P(M(\xi)) = 0$. Then if P(M) = 0, 

$\bar{P}(M) = \int P(M(x_0))\,P(dx_0) = 0;$

hence the set function $\bar{P}(M)$ is absolutely continuous with respect to P(M). 
There is therefore* a function $\varphi(x_0, x_1)$, measurable with respect to $F_\omega$, such 
that if M is a cylinder set of $F_\omega$ over $x_0, x_1$, 

(6.31) $\bar{P}(M) = \int_M \varphi(x_0, x_1)\,dP.$

In particular if M is the intersection of $A_0$ (a cylinder set of $F_\omega$ determined by 
the condition $x_0 \in E_0$) and $A_1$ (a cylinder set of $F_\omega$ determined by the condition 
$x_1 \in E_1$), (6.31) becomes 

$P(A_0)\,P(A_1) = \int_{E_0} P(dx_0) \int_{E_1} \varphi(x_0, x_1)\,P(x_0; dx_1),$
so that 


This equation is to hold for every set Zp in the field F., so that the quantity 
in the brace must vanish, except possibly on an x9-set of P-measure 0, as was 
to be proved. 


THEOREM 6.6. If the conditional probability functions of a temporally homogeneous
stochastic process satisfy the conditions


(6.32)        $P(x_0; A) \ge \lambda_0 P(A)$


for every $P$-measurable cylinder set $A$ over $x_1$,† where $0 < \lambda_\nu \le 1$, and if


* Saks, ibid., p. 257. 
+ The inequalities are to hold with probability 1 for each set A. 


(6.33) 


then the process is metrically transitive and has no angle variables. 


If the process is a Markoff process, we can take $\lambda_\nu = 1$ whenever $\nu > 0$,
leaving only the first inequality of (6.32) as an actual condition. Theorem 6.2
gave a much more sensitive condition.

Let A, be a P-measurable cylinder set over x1, , #n, #21. Then if fis 
the characteristic function of As, and if m=1 (cf. equation (3.10)), then 


oR f den) 


f 0; de) f 


fa — f)P(x-m,*** 5 %n—1; den) 


<-1 f AmAm-1 ees f des) f 


s1- eae) f des) f 


fa — f)P(x1, , den) 


— 


If $A_1$ is a $P$-measurable cylinder set over $x_{-m}, \cdots, x_0$ $(m \ge 0)$, then


(6.35)        $P(A_1 A_2) = \int_{A_1} P(x_{-m}, \cdots, x_0; A_2)\,dP \le P(A_1)\{1 - \lambda[1 - P(A_2)]\}.$


Since this inequality is true for any sets $A_1$, $A_2$ as described, it is true for any
$P$-measurable cylinder sets $A_1$, $A_2$ over $\cdots, x_{-1}, x_0$ and $x_1, x_2, \cdots$, respectively.

Now suppose there is a function $\phi(\omega)$, a complex-valued $P$-measurable function
which does not vanish almost everywhere on $\Omega$, and such that, for some
constant $c$ of modulus 1,

(6.36)        $\phi(T\omega) = c\,\phi(\omega)$

almost everywhere on $\Omega$. To prove the theorem, it is sufficient to show that
$\phi(\omega)$ is identically a constant almost everywhere on $\Omega$. Since


        $\phi(T^{\nu}\omega) = c^{\nu}\phi(\omega), \qquad \nu = \pm 1, \pm 2, \cdots,$

almost everywhere on $\Omega$, if the integers $n_1, n_2, n_3, \cdots$ are chosen so that
$\lim_{j\to\infty} c^{n_j} = 1$, it follows that

(6.37)        $\lim_{j\to\infty} \phi(T^{n_j}\omega) = \lim_{j\to\infty} \phi(T^{-n_j}\omega) = \phi(\omega)$


almost everywhere on $\Omega$. Let $A = A(O)$ be the $\Omega$-set defined by $\phi(\omega) \in O$ where
$O$ is an open set of the complex $\phi$-plane. According to Lemma 6.1, (6.37)
implies that $A(O)$ can be considered (neglecting sets of $P$-measure 0) as a
cylinder set over both $\cdots, x_{-1}, x_0$ and $x_1, x_2, \cdots$. Then in (6.35) we can
take $A_1 = A_2 = A$, obtaining

(6.38)        $P(A) \le P(A)\{1 - \lambda[1 - P(A)]\},$


which implies that $P(A) = 0$, or that $P(A) = 1$. Since $O$ is arbitrary, this means
that there is a constant $\phi_0$ such that $\phi(\omega) = \phi_0$ almost everywhere,* as was to
be proved.

As an application of the theorems of this section, we shall show how to
derive a theorem of Kolmogoroff (I, p. 425).† Suppose that $F_x$ is the Borel
field determined by a denumerable collection of its sets,‡ and that conditional
probability functions are given, as in Theorem 4.1 (ii), except that instead
of supposing that (4.9) implies (4.10), we suppose, with Kolmogoroff, the
validity of the stronger condition that there is a number $\lambda$ $(0 < \lambda < 1)$ such that
whenever $A$ is a cylinder set of $F_\omega$ over $x_1$,


(6.39)        $P(x_0; A) \ge \lambda P(x_0'; A)$


for all $x_0, x_0'$. There is then, according to Theorem 4.1 (ii), a temporally homogeneous
Markoff process with the given conditional probability functions.
From (6.39) (interchanging $x_0$, $x_0'$) we find that


(6.40)        $\lambda P(A) = \lambda \int P(x_0'; A)\,P(dx_0') \le P(x_0; A)$


for all $x_0$. Then according to Theorem 6.6, with $\lambda_0 = \lambda$, $\lambda_1 = \lambda_2 = \cdots = 1$, the
process is metrically transitive and has no angle variables.

* The point $\phi_0$ of the complex plane is the intersection of the interiors of all the circles of rational
radii with centers at points whose coordinates are rational and for which the corresponding $\Omega$-sets
are of $P$-measure 1.

† The proof to be given cannot compare in simplicity or elegance with that of Kolmogoroff.
It is only given to show the significance of Kolmogoroff's hypothesis (6.39), and the place of
such a theorem in this development.

‡ This hypothesis will be eliminated below.

Moreover the hypotheses of Theorem 6.5 (iii) are satisfied, so that, according to part (ii) of
that theorem, $P(x_0; T^{n}A) \to P(A)$ except possibly on an $x_0$-set of $P$-measure 0.
We shall show that this exceptional set is actually empty. In the integral
following


(6.41)        $\int P(x_1; T^{m}A)\,P(x_0; dx_1) = P(x_0; T^{m+1}A)$


we have just seen that the integrand converges to $P(A)$ except possibly on an
$x_1$-set of $P$-measure 0. It follows readily from (6.39) that if $P(A) = 0$, then
$P(x_0; A) = 0$,* so that the exceptional set is of $P(x_0; dx_1)$-measure 0 for each
value of $x_0$. Then term by term integration in (6.41) gives Kolmogoroff's result,
that $P(x_0; T^{n}A) \to P(A)$ for all $x_0$. The assumption made above, that
$F_x$ is the Borel field determined by a denumerable collection of its sets, is
unnecessary, since in any case, if $A$ is preassigned, $F_x$ can be chosen to satisfy
the denumerability condition and the condition that $A$ lies in $F_\omega$.
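
For a process with finitely many states, condition (6.39) can be checked entry by entry, which gives a quick numerical illustration of Kolmogoroff's result. The sketch below is an added illustration, not part of the original argument. It assumes NumPy and a hypothetical three-state transition matrix `P`; the helper name `kolmogoroff_lambda` is invented for this example. It takes the largest admissible constant as the smallest ratio, over the columns, of least to greatest entry, and then compares the rows of a high power of `P`.

```python
import numpy as np

def kolmogoroff_lambda(P):
    """Largest lam with P[j, k] >= lam * P[j2, k] for all j, j2, k.

    Finite-state reading of condition (6.39). Returns 0.0 when some column
    holds both a zero and a positive entry, so that no lam > 0 is admissible.
    """
    P = np.asarray(P, dtype=float)
    col_min = P.min(axis=0)
    col_max = P.max(axis=0)
    used = col_max > 0            # identically zero columns impose no restriction
    return float(np.min(col_min[used] / col_max[used]))

# Hypothetical 3-state transition matrix (each row sums to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

lam = kolmogoroff_lambda(P)
P50 = np.linalg.matrix_power(P, 50)   # 50-step transition probabilities
limit_row = P50[0]                    # candidate absolute probabilities
```

With a positive constant of this kind, all rows of `P50` agree to machine precision in this example, so no initial state is exceptional, in line with the statement that the exceptional set is empty.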

7. Application to the examples of §5. In this section we apply the results 
of §6 to a detailed study of the examples of §5. 

I. In this case, that of a sequence of mutually independent chance variables,
the conditional probabilities become absolute probabilities. If the process
is temporally homogeneous, it is always metrically transitive and has no
angle variables (Theorem 6.4). The ergodic theorem, as applied in Theorem
6.1, gives the strong law of large numbers.‡

II. We have seen above (§5) that absolute probabilities $p_1, \cdots, p_n$ always
exist in case II, and can be obtained in the form


(7.1)        $p_k = \lim_{\nu\to\infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} p_{jk}^{(m)}$        ($j$ fixed).


THEOREM 7.1. (i) Except possibly on an $\Omega$-set of $P$-measure 0,


(7.2)        $\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p_{x_m k}$


exists.§

(ii) If $(j, k)$ is any pair of subscripts, there exists


(7.3)        $\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p_{jk}^{(m)} = q_{jk}.$


* In fact $P(A) \ge \lambda P(x_0'; A)$ for all $x_0'$.

† Kolmogoroff actually obtains more, since he obtains an estimate of the speed of convergence.

‡ Cf. Doob (I, pp. 764-765); Hopf (I, p. 83); Khintchine (I).

§ Part (i) supposes that some set of absolute probabilities is accepted, thus determining $P$-measure.
In (7.2), $p_{x_m k}$ is a chance variable, a function of $\omega$: $(\cdots, x_{-1}, x_0, x_1, \cdots)$, taking on the
value $p_{rk}$ at $\omega$ if $x_m = r$. Since only cylinder sets over $x_1, x_2, \cdots$ are involved in the theorem,
the result holds when only the space of points $(x_1, x_2, \cdots)$ (on which $P$-measure is defined in
terms of that on $\Omega$ in an obvious way) is considered.

The existence of the limit in (7.3) was proved by Fréchet (I, p. 151) by
means of an explicit determination of $p_{jk}^{(m)}$ as a function of $j, k, m$, derived
from the theory of linear difference equations. The proof given here will hold
in the more general case III. It will be remembered that (7.3) is an integrated
form of (7.2).
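
The difference between the averaged limit (7.3) and the limit of the $m$-step probabilities themselves can be seen on the simplest periodic example. The following sketch is an added illustration: it assumes NumPy and a hypothetical two-state matrix, and the name `cesaro_mean` is invented here.

```python
import numpy as np

def cesaro_mean(P, N):
    """Return (1/N) * sum_{m=1}^{N} P^m, the average appearing in (7.3)."""
    P = np.asarray(P, dtype=float)
    power = np.eye(P.shape[0])
    total = np.zeros_like(P)
    for _ in range(N):
        power = power @ P      # power now holds P^m
        total += power
    return total / N

# Two states exchanged at every step: P^m alternates between P and the identity.
flip = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

q = cesaro_mean(flip, 1000)               # averaged limit q_jk
odd = np.linalg.matrix_power(flip, 7)     # equals flip
even = np.linalg.matrix_power(flip, 8)    # equals the identity
```

For even `N` every entry of `q` is 1/2, although the powers themselves oscillate and have no limit.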

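Let $p_1, \cdots, p_n$ be some set of absolute probabilities corresponding to the
given matrix. The first part of the theorem is simply the first part of Theorem
6.1 (cf. equation (6.2)) in this special case. The proof of (ii) requires more care.
We shall first choose a particular set of absolute probabilities $p_1, \cdots, p_n$ obtained
by applying the corollary of Theorem 4.1 (ii), where we define the set
function $Q(E)$ to be equal to the number of points in $E$ divided by $n$. This
convention gives absolute probabilities defined by


(7.4)        $p_k = \lim_{\nu\to\infty} \frac{1}{n} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \sum_{j=1}^{n} p_{jk}^{(m)}$        $(k = 1, \cdots, n)$.


According to Theorem 6.1, the limit in (7.3) exists for all pairs of subscripts
$j, k$ for which $p_j > 0$. Let $J$ be the set of subscripts $j$ for which $p_j = 0$. Then
the limit in (7.3) exists if $j \notin J$. We can write $p_{jk}^{(\mu+\nu)}$ in the following form:


(7.5)        $p_{jk}^{(\mu+\nu)} = \sum_{l=1}^{n} p_{jl}^{(\nu)} p_{lk}^{(\mu)},$

and if we set

        $\Pi_{jk}^{(N)} = \frac{1}{N} \sum_{m=1}^{N} p_{jk}^{(m)}$

we obtain the relation


(7.6)        $\frac{\mu+\nu}{\mu}\,\Pi_{jk}^{(\mu+\nu)} - \frac{\nu}{\mu}\,\Pi_{jk}^{(\nu)} = \sum_{l=1}^{n} p_{jl}^{(\nu)} \Pi_{lk}^{(\mu)}.$


Now according to (7.4), since $p_l = 0$ if $l \in J$,


(7.7)        $\lim_{\nu\to\infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \sum_{j=1}^{n} \sum_{l\in J} p_{jl}^{(m)} = 0.$


This implies that


(7.8)        $\liminf_{\nu\to\infty} \sum_{j=1}^{n} \sum_{l\in J} p_{jl}^{(\nu)} = 0.$


Moreover, it has already been shown that


(7.9)        $\lim_{N\to\infty} \Pi_{lk}^{(N)} = q_{lk}, \qquad l \notin J,$


exists. Then letting $\mu$ become infinite in (7.6), and using (7.9), we obtain


(7.10)        $\limsup_{N\to\infty} \Pi_{jk}^{(N)} = \limsup_{\mu\to\infty} \sum_{l\in J} p_{jl}^{(\nu)} \Pi_{lk}^{(\mu)} + \sum_{l\notin J} p_{jl}^{(\nu)} q_{lk}$


and


(7.11)        $\liminf_{N\to\infty} \Pi_{jk}^{(N)} = \liminf_{\mu\to\infty} \sum_{l\in J} p_{jl}^{(\nu)} \Pi_{lk}^{(\mu)} + \sum_{l\notin J} p_{jl}^{(\nu)} q_{lk}.$

Now

        $0 \le \Pi_{lk}^{(\mu)} \le 1,$

so that


(7.12)        $\limsup_{N\to\infty} \Pi_{jk}^{(N)} - \liminf_{N\to\infty} \Pi_{jk}^{(N)} \le \sum_{l\in J} p_{jl}^{(\nu)}.$


This inequality is true for $\nu = 1, 2, \cdots$, so that, using (7.8), we obtain


(7.13)        $\limsup_{N\to\infty} \Pi_{jk}^{(N)} = \liminf_{N\to\infty} \Pi_{jk}^{(N)},$

as was to be proved.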

Since, in general, there is not a unique set of absolute probabilities
$p_1, \cdots, p_n$, a given matrix may correspond to several temporally homogeneous
processes. If all these processes are metrically transitive, the matrix
$(p_{jk})$ will be called metrically transitive. If none of these processes has angle
variables, the matrix will be said to have no angle variables. Otherwise the
matrix will be said to be not metrically transitive, or to have angle variables,
as the case may be.


THEOREM 7.2. The matrix $(p_{jk})$ is metrically transitive if and only if
(i) there is a single set of absolute probabilities $(p_1, \cdots, p_n)$; or

(ii) the limit $q_{jk}$ depends only on $k$; or

(iii) the equations


(7.14)        $\sum_{j=1}^{n} x_j p_{jk} = x_k, \qquad k = 1, \cdots, n,$


have only a single linearly independent solution in $(x_1, \cdots, x_n)$,* that is, the
matrix $(p_{jk} - \delta_{jk})$ has rank $n - 1$; or

(iv) the characteristic equation of the matrix $(p_{jk})$ has 1 as a simple root; or

(v) the matrix $(p_{jk})$ cannot be put in the form of Fig. 1 (where $R_1$, $R_2$, $R_3$
are square matrices and the 0's represent blocks consisting entirely of 0-elements,
in which $R_3$, but not $R_1$ or $R_2$, may be absent), by means of some permutation
applied to both rows and columns.


        R_1    0     0

         0    R_2    0
              Fig. 1


* This condition is not the same as that of (i) since the absolute probabilities are restricted to be
non-negative.

It would be very difficult to give complete references to previous work on 
the various parts of this and the following theorems, and such references are 
perhaps made unnecessary by Fréchet’s forthcoming book. Since the time of 
Markoff, various writers have rediscovered and extended his results, inde- 
pendently of Markoff and of each other. It is hoped that this paper will 
provide a certain unity to these results, and it is claimed that the terminology 
used to describe the various cases is of more general validity and less ad hoc 
than that previously used. The methods, and some of the results, are new. 
Fréchet and Hadamard (I) have given a historical discussion of some of 
them. The equivalence of (i)—(iv) was shown by Fréchet (I) in the most de- 
tailed treatment of case II which has as yet appeared. The equivalence of 
(ii) and (v) is somewhat related to more specialized results of von Mises 
(I, pp. 533-549). The equivalence of (iv) and (v) (in a somewhat different 
form, with the additional hypothesis that no column of $(p_{jk})$ contains only
0 elements) was obtained by Romanovsky (I, pp. 154-155) by applying theorems
of Frobenius. The matrix can be further decomposed if 1 is a root of
multiplicity >2. As Romanovsky proves, and as follows readily here also,
$R_1$, $R_2$ can be replaced by $\nu$ boxes along the main diagonal, if $\nu$ is the multiplicity
of 1 as a root of the characteristic equation. A complete proof of each part
of Theorem 7.2 will be given, since the method will be available for the treatment
of case III, and the details of the latter case will then be omitted.
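
Criteria (iii) and (iv) of Theorem 7.2 lend themselves to a direct numerical check. The sketch below is an added illustration under stated assumptions: it uses NumPy, fixed numerical tolerances, and two hypothetical matrices; the function names are invented for the example and the tolerance-based tests are only as reliable as floating-point rank and root computations allow.

```python
import numpy as np

def rank_criterion(P, tol=1e-9):
    """Theorem 7.2 (iii): the matrix (p_jk - delta_jk) has rank n - 1."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    return bool(np.linalg.matrix_rank(P - np.eye(n), tol=tol) == n - 1)

def multiplicity_of_one(P, tol=1e-6):
    """Number of characteristic roots within tol of 1 (criterion (iv) asks for 1)."""
    roots = np.linalg.eigvals(np.asarray(P, dtype=float))
    return int(np.sum(np.abs(roots - 1.0) < tol))

# Two closed classes: the form of Fig. 1 with R_1 and R_2 present, R_3 absent.
two_classes = np.array([[0.5, 0.5, 0.0, 0.0],
                        [0.5, 0.5, 0.0, 0.0],
                        [0.0, 0.0, 0.3, 0.7],
                        [0.0, 0.0, 0.6, 0.4]])

# One absorbing state and one state of absolute probability 0 (only R_1 and R_3).
with_transient = np.array([[1.0, 0.0],
                           [0.5, 0.5]])
```

In this example `two_classes` fails both tests (rank 2 instead of 3, and 1 is a double root), while `with_transient` passes them, which illustrates that the presence of $R_3$ alone does not prevent metric transitivity.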

Proof of (i). Suppose that the given matrix is metrically transitive, and
let $p_1, \cdots, p_n$ be a set of absolute probabilities corresponding to it. Then
according to Theorem 6.1, Corollary 1,


        $q_{jk} = p_k, \qquad k = 1, \cdots, n,$


if $p_j > 0$. If $p_1', \cdots, p_n'$ is a second set of absolute probabilities corresponding
to the given matrix, then


        $\bigl(\tfrac12(p_1 + p_1'), \cdots, \tfrac12(p_n + p_n')\bigr)$


is also a set of absolute probabilities corresponding to the given matrix, so
that if $p_j + p_j' > 0$,


        $q_{jk} = \tfrac12(p_k + p_k'), \qquad k = 1, \cdots, n.$

Combining these two results, if we choose $j_0$ so that $p_{j_0} > 0$,

        $q_{j_0 k} = p_k = \tfrac12(p_k + p_k'),$

that is,

        $p_k = p_k', \qquad k = 1, \cdots, n.$


Thus metric transitivity implies that there is only a single set of absolute
probabilities.

Conversely, suppose that there is only a single set of absolute probabilities,
$p_1, \cdots, p_n$. It can be verified directly that for each value of $j$, $q_{j1}, \cdots, q_{jn}$
is a set of absolute probabilities corresponding to the given matrix,* so that
$q_{jk} = p_k$ for all $j, k$. If the matrix is not metrically transitive, the process
determined by the matrix of conditional probabilities $(p_{jk})$ and the absolute
probabilities is not metrically transitive (that is, the corresponding transformation
$T$ is not metrically transitive), so that there is, according to
Theorem 6.2, a set of subscripts $K$, such that

(7.15)        $0 < p_k, \quad k \in K, \qquad \sum_{k\in K} p_k < 1,$


(7.16)        $\sum_{k\in K} p_{jk}^{(m)} = 1, \qquad j \in K, \quad m = 1, 2, \cdots.$


Then

        $\sum_{k\in K} \Pi_{jk}^{(N)} = 1, \qquad j \in K,$


so that, using the fact that $\Pi_{jk}^{(N)} \to p_k$, and (7.16), we obtain


        $\sum_{k\in K} p_k = \lim_{N\to\infty} \sum_{k\in K} \Pi_{jk}^{(N)} = 1, \qquad j \in K,$


contradicting (7.15). The matrix is thus metrically transitive.


* In general, it can be shown that if $q_1, \cdots, q_n$ is a linear combination of columns of the matrix
$(q_{ij})$, where the coefficients of the combination are non-negative and have sum 1, then $q_1, \cdots, q_n$
is a set of absolute probabilities corresponding to the given matrix, and conversely every set of absolute
probabilities corresponding to the given matrix can be obtained in this way. Cf. the discussion
of case III below.

† The $\Omega$-set determined by the condition $x_0 \in K$ is invariant under $T$ up to an $x_0$-set of $P$-measure
0. Equation (7.16) for $m = 1$ is then the first equation of (6.11), and it follows for $m > 1$ by direct
verification in view of the definition of $p_{jk}^{(m)}$.

Proof of (ii). If the matrix is metrically transitive, it was shown in the
proof of (i) that $q_{jk} = p_k$ for all $j, k$ (where $p_1, \cdots, p_n$ is the uniquely determined
set of absolute probabilities) and $q_{jk}$ therefore depends only on $k$.
Conversely if $q_{jk} = q_k$ is independent of $j$, we shall show that the absolute
probabilities are uniquely determined by investigating the solutions of (7.14).


Let $(x_1, \cdots, x_n)$ be a solution of (7.14). Then it can be verified directly that

(7.17)        $\sum_{j=1}^{n} x_j p_{jk}^{(m)} = x_k, \qquad k = 1, \cdots, n; \quad m = 1, 2, \cdots,$

so that

(7.18)        $\sum_{j=1}^{n} x_j \Pi_{jk}^{(N)} = x_k, \qquad k = 1, \cdots, n; \quad N = 1, 2, \cdots.$


Letting $N$ become infinite in (7.18) we obtain


(7.19)        $\sum_{j=1}^{n} x_j q_{jk} = q_k \sum_{j=1}^{n} x_j = x_k, \qquad k = 1, \cdots, n.$

Then if $(p_1, \cdots, p_n)$ is a set of absolute probabilities, since $p_1, \cdots, p_n$ is a
solution of (7.14) and since $p_1 + \cdots + p_n = 1$,


        $p_k = q_k \sum_{j=1}^{n} p_j = q_k, \qquad k = 1, \cdots, n.$

Thus the absolute probabilities are uniquely determined; which fact implies,
according to (i), that the matrix $(p_{jk})$ is metrically transitive.
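
As a worked illustration of (7.14) and (7.19), added here and not part of the original text, consider a two-state matrix with hypothetical parameters $a$, $b$ satisfying $0 < a, b < 1$. The system (7.14) can then be solved by hand, and the single solution (up to a factor) gives the absolute probabilities.

```latex
\[
(p_{jk}) = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix},
\qquad 0 < a < 1,\quad 0 < b < 1 .
\]
% Equation (7.14) for k = 1 (the equation for k = 2 is equivalent):
\[
x_1 (1-a) + x_2\, b = x_1
\quad\Longleftrightarrow\quad
a\,x_1 = b\,x_2 ,
\]
% so every solution is a multiple of (b, a), and normalizing the sum to 1 gives
\[
(p_1, p_2) = \Bigl( \frac{b}{a+b},\ \frac{a}{a+b} \Bigr),
\qquad
q_{jk} = p_k \quad (j, k = 1, 2).
\]
% The matrix (p_{jk} - \delta_{jk}) has rank 1 = n - 1, and the characteristic
% roots are 1 and 1 - a - b, so that 1 is a simple root.
```

Here the limit $q_{jk}$ depends only on $k$, in agreement with criterion (ii).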

Proof of (iii). If the system (7.14) has only a single linearly independent
solution, the absolute probabilities (which constitute a particular solution)
are surely uniquely determined, so the matrix is metrically transitive (according
to (i)). Conversely, according to (i), if the matrix is metrically transitive,
there is a unique set of absolute probabilities $p_1, \cdots, p_n$, and we have
seen that $q_{jk} = p_k$ $(j, k = 1, \cdots, n)$. Then if $(x_1, \cdots, x_n)$ is any solution of
(7.14), it is linearly dependent on $(p_1, \cdots, p_n)$ by (7.19).

Proof of (iv). Since a set of absolute probabilities is a solution of (7.14),
1 is always a root of the characteristic equation of the matrix $(p_{jk})$. If the
matrix is metrically transitive, there is only a single linearly independent
solution of (7.14), according to (iii), and we shall show that 1 is a simple root
of the characteristic equation of the matrix. Suppose the contrary. If all the
columns of the matrix $(p_{jk} - \lambda\delta_{jk})$ are added to the first, every element of the
first column becomes $1 - \lambda$. If $1 - \lambda$ is factored from the determinant, the
determinant still vanishes for $\lambda = 1$, by hypothesis. Then the equations


(7.20)        $\sum_{j=1}^{n} x_j = 0, \qquad \sum_{j=1}^{n} x_j p_{jk} = x_k, \quad k = 2, \cdots, n,$

have a non-trivial solution (expressing the fact that the rows of the determinant
are linearly dependent). If the last $n - 1$ equations are subtracted from
the first, the first becomes


        $\sum_{j=1}^{n} x_j p_{j1} = -\sum_{k=2}^{n} x_k = x_1.$

Thus $(x_1, \cdots, x_n)$ is a solution of (7.14), and since $x_1 + \cdots + x_n = 0$, it is not
linearly dependent on $(p_1, \cdots, p_n)$; which contradicts the fact that there is
only a single linearly independent solution of (7.14).

Conversely, if 1 is a simple root of the characteristic equation of the matrix
$(p_{jk})$, the system (7.14) has only a single linearly independent solution.*

Proof of (v). If the given matrix is metrically transitive, we shall prove,
using the fact that $q_{jk} = p_k$ for all $j, k$ (where $p_1, \cdots, p_n$ is the uniquely determined
set of absolute probabilities), that the matrix $(p_{jk})$ cannot be put
in the form of Fig. 1, with $R_1$ and $R_2$ both present, by a transformation of
the type described. (Such a transformation corresponds to a relabeling of the
points of $X$.) If, on the contrary, the matrix can be put in this form, it is no
restriction to assume that it is already in this form. It can then be verified
directly that the iterated matrix $(p_{jk}^{(m)})$, and therefore $(\Pi_{jk}^{(N)})$, will also be in
this form with the same blocks $R_1$, $R_2$. But then each column of the limiting
matrix $(q_{jk}) = (p_k a_j)$ (with $a_1 = \cdots = a_n = 1$) contains zeros, so that
$p_1 = \cdots = p_n = 0$, contrary to fact.

To show the converse we shall assume that the matrix is not metrically
transitive and put it in the form described. Let $p_1, \cdots, p_n$ be a set of absolute
probabilities for which the corresponding process is not metrically transitive.
If any $p$'s vanish, we can assume they are the last ones:†


(7.22)        $p_j > 0, \quad j \le a, \quad 0 < a \le n; \qquad p_{a+1} = \cdots = p_n = 0.$


Since


        $\sum_{j=1}^{n} p_j p_{jk} = p_k,$


it follows that

(7.23)        $p_{jk} = 0, \qquad \text{if } j \le a, \ k > a,$


where the inequalities on $j, k$ are to hold simultaneously. Because of the fact
that there is not metric transitivity, there is a set of subscripts $K$ such that
$p_j > 0$ if $j \in K$, that $\sum_{k\in K} p_k < 1$, and that


(7.24)        $p_{jk} = 0, \quad j \in K, \ k \notin K; \qquad p_{jk} = 0, \quad j \notin K, \ k \in K, \ j \le a.$*


We can assume that $K$ consists of an initial block of subscripts; then equations (7.23)
and (7.24) describe the form of Fig. 1. Since $K$ is not empty, $R_1$ cannot be
absent; since $\sum_{k\in K} p_k < 1$, $R_2$ cannot be absent. If no $p_j$ vanishes, $R_3$ is absent.

* This follows from elementary matrix theory and does not depend upon the particular properties
of our matrix $(p_{jk})$.

† This assumption implies the possibility of a matrix transformation of the type described in the
theorem.


THEOREM 7.3. The matrix $(p_{jk})$ has no angle variables if and only if
(i) for every pair of subscripts $j, k$,

(7.25)        $\lim_{m\to\infty} p_{jk}^{(m)} = q_{jk}$

exists; or
(ii) 1 is the only root of modulus 1 of the characteristic equation of $(p_{jk})$; or
(iii) the matrix cannot be put in the form of Fig. 1 by a transformation of
the type described in Theorem 7.2 (v), where $R_2$, $R_3$ may not be present, and
where $R_1$ is itself in the form of Fig. 2, obtained by dividing the subscripts into
$\nu$ groups $J_1, \cdots, J_\nu$ of consecutive subscripts, such that $p_{jk} = 0$, unless $(j, k)$ is
in some one of the square matrices $S_1, \cdots, S_\nu$ for which $j \in J_r$, $k \in J_{r+1}$ $(r < \nu)$, or
$j \in J_\nu$, $k \in J_1$.

              J_1   J_2   J_3   J_4
        J_1    0    S_1    0     0
        J_2    0     0    S_2    0
        J_3    0     0     0    S_3        ν = 4
        J_4   S_4    0     0     0
                    Fig. 2


* Cf. Theorem 6.2. The $\Omega$-set determined by the condition $x_0 \in K$ is invariant under $T$, if we neglect
a set of $P$-measure 0. The first set of equations in (7.24) is obtained from the first equation of
(6.11), which implies that $P(x_0; CA) = 0$ $(x_0 \in E)$, except perhaps for an $x_0$-set of $P$-measure 0, and the
second set of equations in (7.24) is obtained directly from the second equation of (6.11).

† The equivalence of (i), (ii) was shown by Fréchet (I, q.v. for earlier references). Romanovsky
(I) has discussed matrices like that of Fig. 2. Cf. also Doeblin and Fortet (I, p. 1700). The latter
authors omit mention of exceptional points, implying here the possible existence of $R_3$.

Proof of (i). Suppose that the matrix $(p_{jk})$ has no angle variables. Let
$p_1, \cdots, p_n$ be a set of absolute probabilities corresponding to this matrix.
The condition of Theorem 6.5 will be satisfied if, whenever $r$ is a subscript
such that $p_r > 0$, $p_{rk}$ can be put in the form


        $p_{rk} = \phi(r, k)\,p_k.$


This will be possible if, for $r$ fixed, $p_{rk} = 0$ whenever $p_k = 0$; but this is true,
since


        $\sum_{j=1}^{n} p_j p_{jk} = p_k.$

Thus the condition of Theorem 6.5 is satisfied so that the limit in (7.25)
exists for all $j, k$ for which $p_j > 0$. Then if $J$ is the set of subscripts for which
the absolute probabilities vanish, the limit in (7.25) exists if $j \notin J$. We shall
suppose, using the results of Theorem 7.1 and the fact that for any subscript $j$,
$q_{j1}, \cdots, q_{jn}$ is a set of absolute probabilities corresponding to the given matrix,
that the absolute probabilities $p_1, \cdots, p_n$ are given by


(7.26)        $p_k = \frac{1}{n} \sum_{j=1}^{n} q_{jk} = \frac{1}{n} \sum_{j=1}^{n} \lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p_{jk}^{(m)}.$

We can write $p_{jk}^{(\mu+\nu)}$ in the form


(7.27)        $p_{jk}^{(\mu+\nu)} = \sum_{l\notin J} p_{jl}^{(\mu)} p_{lk}^{(\nu)} + \sum_{l\in J} p_{jl}^{(\mu)} p_{lk}^{(\nu)}.$


Then letting $\nu$ become infinite in (7.27), we obtain


(7.28)        $\limsup_{m\to\infty} p_{jk}^{(m)} \le \sum_{l\notin J} p_{jl}^{(\mu)} q_{lk} + \sum_{l\in J} p_{jl}^{(\mu)},$


while $\liminf_{m\to\infty} p_{jk}^{(m)} \ge \sum_{l\notin J} p_{jl}^{(\mu)} q_{lk}$, so that


(7.29)        $\limsup_{m\to\infty} p_{jk}^{(m)} - \liminf_{m\to\infty} p_{jk}^{(m)} \le \sum_{l\in J} p_{jl}^{(\mu)}.$


Now as in the proof of Theorem 7.1 we know that (7.8) is true, and combining
this with (7.29) we obtain


        $\limsup_{m\to\infty} p_{jk}^{(m)} = \liminf_{m\to\infty} p_{jk}^{(m)},$


as was to be proved.

Conversely, if the limit in (7.25) exists for every pair of subscripts $j, k$,
Theorem 6.5 shows that there can be no angle variables in any process corresponding
to the matrix $(p_{jk})$; thus the matrix $(p_{jk})$ has no angle variables.

Proof of (ii). If there are no angle variables corresponding to the given matrix,
there can be no root of the characteristic equation of the matrix $(p_{jk})$
of modulus 1, other than 1; for if $c$ is such a root, there is a set of constants
$x_1, \cdots, x_n$, not all 0, such that


(7.30)        $\sum_{k=1}^{n} p_{jk} x_k = c\,x_j, \qquad j = 1, \cdots, n.$

Then

(7.31)        $\sum_{k=1}^{n} p_{jk}^{(m)} x_k = c^{m} x_j, \qquad j = 1, \cdots, n; \quad m = 1, 2, \cdots.$


When $m$ becomes infinite, the left side converges, to $\sum_{k=1}^{n} q_{jk} x_k$, whereas the
right side, if $j$ is chosen so that $x_j \ne 0$, does not converge. Then the characteristic
root $c$ is impossible.

Conversely, suppose that there is no root of the characteristic equation
of modulus 1 other than 1. Let $p_1, \cdots, p_n$ be any set of absolute probabilities
corresponding to the given matrix. We shall prove that the temporally homogeneous
process defined in terms of $p_1, \cdots, p_n$ and $(p_{jk})$ can have no angle
variable, by showing that the existence of an angle variable implies the existence
of a root (not equal to 1) of the characteristic equation, of modulus 1.
If $\psi(x_0)$ is an angle variable, and if $\psi(j) = y_j$, (6.18) becomes


(7.32)        $\sum_{k=1}^{n} p_{jk} y_k = c\,y_j, \qquad |c| = 1, \ c \ne 1,$

for those values of $j$ for which $p_j > 0$. Let these make up the set of subscripts
not belonging to the set $J$. If $j \notin J$, we have seen above that $p_{jk} = 0$ for $k \in J$, so
that changing $y_k$ for $k \in J$ does not affect (7.32). We shall attempt to re-define
$y_j$ for $j \in J$ to make (7.32) valid for all $j$. To do this, we must solve the following
system of equations for $y_j$, $j \in J$:


(7.33)        $\sum_{k=1}^{n} p_{jk} y_k = c\,y_j, \qquad j \in J;$

or, if we set


        $\sum_{k\notin J} p_{jk} y_k = a_j,$


the system


(7.34)        $\sum_{k\in J} p_{jk} y_k = c\,y_j - a_j, \qquad j \in J,$


that is,


(7.35)        $\sum_{k\in J} (p_{jk} - c\,\delta_{jk})\,y_k = -a_j, \qquad j \in J.$

If these equations have a solution, this solution, when combined with the $y_j$'s
for $j \notin J$, satisfies (7.32) for all $j$, so that the matrix $(p_{jk})$ has the number $c$ as a
root of its characteristic equation. On the other hand, if the system (7.35) has
no solution, the matrix $(p_{jk})$, with $j, k$ restricted to $J$, has $c$ as a root of its
characteristic equation, so that there is a set of numbers $\{y_j\}$, $j \in J$, not all 0,
satisfying

(7.36)        $\sum_{k\in J} p_{jk} y_k = c\,y_j, \qquad j \in J.$

But then if $y_j$ is defined as 0 for $j \notin J$, the set $y_1, \cdots, y_n$ provides a non-trivial
solution of (7.32) for all $j$, so that again the matrix $(p_{jk})$ has the number $c$ as a
root of its characteristic equation. In any case then, the hypothesis that there
is an angle variable implies the existence of a root $c \ne 1$, $|c| = 1$, (which is the
characteristic value corresponding to the angle variable) of the characteristic
equation of the matrix.
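
Criterion (ii) of Theorem 7.3 can be examined numerically by listing the characteristic roots of modulus 1. The sketch below is an added illustration: it assumes NumPy, a fixed tolerance, and two hypothetical three-state matrices, and the function names are invented for the example.

```python
import numpy as np

def unit_modulus_roots(P, tol=1e-8):
    """Characteristic roots of P whose modulus is within tol of 1."""
    roots = np.linalg.eigvals(np.asarray(P, dtype=float))
    return roots[np.abs(np.abs(roots) - 1.0) < tol]

def only_unit_root_is_one(P, tol=1e-8):
    """Numerical form of criterion (ii): every root of modulus 1 equals 1."""
    return bool(np.all(np.abs(unit_modulus_roots(P, tol) - 1.0) < tol))

# Cyclic matrix of the type of Fig. 2 with three groups of one subscript each.
cycle = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])

# The same chain, but remaining in place with probability 1/2 at each step.
lazy = 0.5 * np.eye(3) + 0.5 * cycle

cycle_roots = unit_modulus_roots(cycle)   # the three cube roots of unity
```

For `cycle` the roots of modulus 1 are the cube roots of unity, so $c^m x_j$ in (7.31) cannot converge; for `lazy` the only root of modulus 1 is 1.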

Proof of (iii). If there are no angle variables, the matrix $(p_{jk})$ cannot be
put in the form described; for it is readily verified that if $(p_{jk})$ is in this form,
the matrices $(p_{jk}^{(\nu+1)})$, $(p_{jk}^{(2\nu+1)})$, $\cdots$ are of the same form, whereas the matrices
$(p_{jk}^{(2)})$, $(p_{jk}^{(\nu+2)})$, $\cdots$ are of the same form except that the non-zero
blocks of $R_1$ are the matrices determined by $(J_1 J_3)$, $(J_2 J_4)$, $\cdots$ instead
of $(J_1 J_2)$, $(J_2 J_3)$, $\cdots$. Then if $p_{jk}^{(m)} \to q_{jk}$, the submatrix $R_1$ of $(q_{jk})$ must have
only 0 elements; but this contradicts the fact that the sum of the elements in
each row of $R_1$ is 1. (It would also have been possible to prove this part by
giving an explicit definition of an angle variable corresponding to the given
matrix.)

Conversely, suppose that there is an angle variable corresponding to some
choice $p_1, \cdots, p_n$ of the absolute probabilities, so that (Theorem 6.3 (i))
there is a function $\psi(x_0)$ such that


(7.37)        $\psi(x_1) = c\,\psi(x_0), \qquad |c| = 1, \ c \ne 1,$


except on an $(x_0, x_1)$-set of $P$-measure 0. Since $\psi(x_0)$ (which takes on at most
$n$ values) necessarily takes on some value on a set of positive $P$-measure, $c$
must be a root of unity (Theorem 6.3 (ii)). This fact will appear again below.
Let $\psi(j) = y_j$. Let $a_1, a_2, \cdots$ be those non-zero values in the set $y_1, \cdots, y_n$
for which the corresponding probability $p_j$ is positive. Define $J_i$ as the set of
subscripts $j$ for which $p_j > 0$, and $y_j = a_i$. Let $n_i$ be the number of subscripts
in $J_i$. We can assume (transforming the matrix as described above if necessary)
that $J_1$ consists of the first $n_1$ subscripts, $J_2$ of the next $n_2$ subscripts,
and so on. According to (7.37), some $a_i$ will necessarily be $c a_1$, and we can
suppose it to be $a_2$. In the same way, some $a_i$ will necessarily be $c a_2$, and we
can assume it to be $a_3$ (unless it is $a_1$), $\cdots$. Continuing this, we will necessarily
find a first integer $\nu > 1$, such that if $a_1, a_2, \cdots, a_\nu$ are chosen successively
as described, so that $a_2 = c a_1, \cdots, a_\nu = c^{\nu-1} a_1$, the next application of
the algorithm will give $c a_\nu = a_1$, and hence $c^{\nu} = 1$. Then $c$ is a $\nu$th root of unity,
and $1 < \nu \le n$. If $n_0 + \cdots + n_{r-1} < x_0 \le n_0 + \cdots + n_r$* (so that $\psi(x_0) = a_r$), then if $r < \nu$,
it follows that $\psi(x_1) = a_{r+1}$ necessarily (if $r = \nu$, $\psi(x_1) = a_1$ necessarily), if we
neglect $x_1$-sets of $P$-measure 0, that is, subscripts $k$ for which $p_k = 0$; and


        $p_{jk} = 0, \qquad j \in J_r, \ k \notin J_{r+1}, \ p_k > 0, \quad r = 1, \cdots, \nu - 1,$

        $p_{jk} = 0, \qquad j \in J_\nu, \ k \notin J_1, \ p_k > 0.$


These equations describe the $(J_1 + \cdots + J_\nu) \times (J_1 + \cdots + J_\nu)$ matrix $R_1$. The fact that the
$\Omega$-set determined by the condition $x_0 \in J_1 + \cdots + J_\nu$ is invariant under the
transformation $T$ up to a set of $P$-measure 0 means that the matrix $(p_{jk})$
can be put in the form of Fig. 1, as was shown in the proof of the preceding
theorem, except that in this case the matrices $R_2$, $R_3$ may be absent. The
matrix $R_3$ is absent if every $p_j$ is positive; $R_2$ is absent if the $\Omega$-set, determined by the
condition $x_0 \in J_1 + \cdots + J_\nu$, has $P$-measure 1.
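
As a concrete instance of the construction just described (an illustration added here, with a hypothetical two-state matrix), take two states that are exchanged at every step, with absolute probabilities $\tfrac12, \tfrac12$. The angle variable and the groups $J_1$, $J_2$ can be written down explicitly.

```latex
\[
(p_{jk}) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad p_1 = p_2 = \tfrac12,
\qquad \psi(1) = 1,\quad \psi(2) = -1 .
\]
% With probability 1 the state changes at each step, so that
\[
\psi(x_1) = -\psi(x_0), \qquad \text{that is, (7.37) holds with } c = -1 .
\]
% The successive values are
\[
a_1 = 1, \qquad a_2 = c\,a_1 = -1, \qquad c\,a_2 = a_1 ,
\]
% hence \nu = 2, c^{2} = 1, J_1 = \{1\}, J_2 = \{2\}, and R_1 is the whole
% matrix, in the form of Fig. 2 with S_1 = S_2 = (1); R_2 and R_3 are absent.
```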


THEOREM 7.4. The matrix $(p_{jk})$ is metrically transitive and has no angle
variables if and only if
(i) for every pair of subscripts $j, k$, $\lim_{m\to\infty} p_{jk}^{(m)}$ exists and is independent of $j$; or

(ii) the root 1 is the only root of the characteristic equation of modulus 1 and
is itself a simple root; or

(iii) the matrix cannot be put in the form of Fig. 1 if either both $R_1$ and $R_2$
are present, or if $R_1$ is present and has the form of Fig. 2; or

(iv) each matrix $(p_{jk}^{(1)}), \cdots, (p_{jk}^{(n)})$ is metrically transitive.

Only the last part requires any comment. Suppose that the matrix $(p_{jk})$
is metrically transitive and has no angle variables. Then $p_{jk}^{(m)} \to p_k$ (where
$p_1, \cdots, p_n$ is the uniquely determined set of absolute probabilities). In particular,
$\lim_{m\to\infty} p_{jk}^{(\nu m)} = p_k$, so that, according to (i), the matrix $(p_{jk}^{(\nu)})$ is metrically
transitive. Conversely, suppose that the matrices $(p_{jk}^{(1)}), \cdots, (p_{jk}^{(n)})$ are
metrically transitive. We shall prove that the matrix $(p_{jk})$ can have no angle
variable. If there is an angle variable, its characteristic value $c$ $(|c| = 1, c \ne 1)$
is a root of the characteristic equation of the matrix $(p_{jk})$ (cf. the proof of
Theorem 7.3 (iii)), and we have seen that this number $c$ must be a root of
unity of order less than or equal to $n$. Then there is a set of numbers
$x_1, \cdots, x_n$ (not all 0) such that


        $\sum_{j=1}^{n} x_j p_{jk} = c\,x_k, \qquad k = 1, \cdots, n.$


Moreover, if we sum over $k$,


        $\sum_{j=1}^{n} x_j = c \sum_{k=1}^{n} x_k,$

which implies that $\sum_{j=1}^{n} x_j = 0$. If $\nu$ is chosen so that $c^{\nu} = 1$ $(\nu \le n)$, then
$\sum_{j=1}^{n} x_j p_{jk}^{(\nu)} = c^{\nu} x_k = x_k$. Now the (uniquely defined) absolute probabilities
$p_1, \cdots, p_n$ satisfy the equations


        $\sum_{j=1}^{n} p_j p_{jk}^{(\nu)} = p_k.$

If the matrix $(p_{jk}^{(\nu)})$ is to be metrically transitive, the sets $p_1, \cdots, p_n$,
$x_1, \cdots, x_n$ must be linearly dependent (Theorem 7.2 (iii)), but this is impossible,
since


        $\sum_{j=1}^{n} p_j = 1, \qquad \sum_{j=1}^{n} x_j = 0.$

The hypothesis that the matrix $(p_{jk})$ has an angle variable has thus led to a
contradiction.

* Take $n_0 = 0$.
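
Criterion (iv) can be tested mechanically by applying the rank condition of Theorem 7.2 (iii) to each of the first $n$ iterated matrices. The following sketch is an added illustration under stated assumptions: it uses NumPy, a fixed rank tolerance, and two hypothetical three-state matrices; the name `all_powers_transitive` is invented for the example.

```python
import numpy as np

def all_powers_transitive(P, tol=1e-9):
    """Criterion (iv) of Theorem 7.4, using the rank test of Theorem 7.2 (iii)
    on each of the matrices P^1, ..., P^n."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    power = np.eye(n)
    for _ in range(n):
        power = power @ P                      # power holds P^m, m = 1, ..., n
        if np.linalg.matrix_rank(power - np.eye(n), tol=tol) != n - 1:
            return False
    return True

# Cyclic matrix: metrically transitive, but its third power is the identity.
cycle = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])

# Same chain with probability 1/2 of remaining in place at each step.
lazy = 0.5 * np.eye(3) + 0.5 * cycle

limit = np.linalg.matrix_power(lazy, 200)      # approximates lim p_jk^(m)
```

The matrix `cycle` fails the test at the third power, in agreement with the presence of an angle variable, while `lazy` passes and its powers approach a matrix whose rows are all equal to $(1/3, 1/3, 1/3)$, as criterion (i) requires.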

III. In this example, special conditions must be imposed on the conditional
probability density $p(x, y)$ to insure the existence of an absolute probability
function. If there is an absolute probability function, it was shown in §5
that it is determined by an $X$-integrable density function $p(x)$ which can be
supposed to satisfy


(7.38)        $\int_X p(x)\,p(x, y)\,dx = p(y)$


for all $y \in X$. In the following we shall mean by an absolute probability density
$p(x)$ an $X$-integrable function, which is non-negative and satisfies (7.38) for
all $y$. It follows that two absolute probability densities which are equal almost
everywhere on $X$ are identical. We shall assume throughout that $p^{(m)}(x, y)$, for
some integer $m \ge 1$, satisfies the condition of uniform integrability discussed in §5
insuring the existence of at least one absolute probability density function. In
fact, we have seen in the corollary to Theorem 4.1 that if $Q(E)$ is any probability
measure defined on the sets of $F_x$, we can obtain a probability density
$p(x)$ in the form


(7.39)        $\int_E p(x)\,dx = \lim_{\nu\to\infty} \frac{1}{N_\nu} \sum_{m=1}^{N_\nu} \int_X Q(dx) \int_E p^{(m)}(x, y)\,dy.$


The denumerability hypothesis on the field $F_x$ and the uniform integrability
hypothesis were employed in order to obtain a certain compactness in an aggregate
of set functions (cf. §4) through which (7.39) was derived. More abstract
formulations are possible (cf. the hypotheses of Kryloff and Bogolioùboff
(I)).


THEOREM 7.5. (i) Except possibly on an $\Omega$-set of $P$-measure 0,


(7.40)        $\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p(x_m, y)$*


exists for each value of $y$ for which $p(y)$ is finite-valued.
(ii) The limit


(7.41)        $\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} \int_E p^{(m)}(x, y)\,dy = q(x, E)$

exists for all $x \in X$ and every set $E \in F_x$. If there is a value of $m$ for which $p^{(m)}(x, y)$
is a bounded function of $x$ for each value of $y$,


(7.42)        $\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p^{(m)}(x, y) = q(x, y)$

exists for all $x, y$.


Part (i) of the theorem is new. Part (ii), which is an integrated form of (i),
generalizes Theorem 7.1 (ii). The existence of the limit in (7.42) was proved
by Fréchet (II, p. 81) who supposes that $X$ is the closed cover of a bounded
domain of euclidean space, and that there is a value of $m$ such that $p^{(m)}(x, y)$
is a bounded function. Fréchet's results were generalized by Kryloff and
Bogolioùboff (I) to a form which is substantially identical to Theorem 7.5
(ii) (cf. the note above on the hypotheses of the present discussion) but less
general than Theorem 6.5 (ii).

Proof of (i). Let $p(x)$ be an absolute probability density corresponding to
the given conditional probability density. For each fixed value of $y$ for which
$p(y) < \infty$, $p(x_0, y)$ is a $P$-measurable function, depending only on $x_0$, which is
integrable on $\Omega$; namely


        $\int_\Omega p(x_0, y)\,dP = \int_X p(x)\,p(x, y)\,dx = p(y).$


Then according to Theorem 6.1 (if $\phi(\omega) = p(x_0, y)$) the limit in (7.40) exists
almost everywhere on $\Omega$.

* Cf. the note to Theorem 7.1 (i).

Proof of (ii). This proof follows the outline of the special case considered 
in Theorem 7.1 (ii). There is an X-measurable function ¢(x), such that 


o(x) > 0, f eae = 1.* 


If in (7.39) we define Q(E) as {,¢(x)dx, an absolute probability density p(x) 
is obtained in the form 

1 Nv 
(7.39’) f p(x)dx = lim — >) | o(x)dx f p™(x, y)dy, 
E E 


N, m=1 


where the sequence {N,} is independent of E. We shall suppose that p(x) is 
defined by (7.39’). If we define II (x, y) by 


1 N 
TE (x, y) y), 
N m=1 
we find (cf. equation (7.6)) that 


v 
(7.43) y) (x, y) = f p(x, (2, y)dz. 


Let $E_0$ be the X-set on which p(x) = 0. Then if E ∈ $F_x$, the limit in (7.41) exists,
according to Theorem 6.1 (ii), for x almost everywhere in the complement of
$E_0$, and

(7.44) $\lim_{N\to\infty} \int_E \Pi^{(N)}(x, y)\,dy = q(x, E), \quad x \in CE_0,$



except possibly for an X-set of measure 0. According to equation (7.41), 


1 Nv 
lim | ¢(x)dx f y)dy = lim | o(x)dx f y)dy = 0. 
E Eo 


yoo 0 N, m=1 


* If /idx=d< ©, we can take (x) =1/A. Otherwise we use the fact (cf. §5) that there is an in- 
creasing sequence E:€ - - - of X-measurable sets such that) and that g,ldx=hj< 
There is a sequence of positive numbers {¢,} such that ¢o-+)_; €n(Ansi—An) = 1, and we define ¢(x) 
as «9 on and e, on En for n>1. 
This implies that

(7.45) $\liminf_{m\to\infty} \int_X \phi(x)\,dx \int_{E_0} p^{(m)}(x, y)\,dy = 0.$

Equation (7.45) means that some subsequence $\{\phi(x)\int_{E_0} p^{(a_j)}(x, y)\,dy\}$ of
$\{\phi(x)\int_{E_0} p^{(m)}(x, y)\,dy\}$, when integrated over X, converges to 0. This implies
that $\phi(x)\int_{E_0} p^{(a_j)}(x, y)\,dy$ converges in measure to 0* which in turn implies
that a further subsequence $\{\phi(x)\int_{E_0} p^{(b_j)}(x, y)\,dy\}$ converges to 0 for almost all x.
Since φ(x) > 0,

$\lim_{j\to\infty} \int_{E_0} p^{(b_j)}(x, y)\,dy = 0$

for almost all x. Now

$\int_{E_0} p^{(b_j+1)}(x, y)\,dy = \int_X p(x, z)\,dz \int_{E_0} p^{(b_j)}(z, y)\,dy.$

The z-integrand (for x fixed), $p(x, z)\int_{E_0} p^{(b_j)}(z, y)\,dy$, converges to 0, for almost
all z, according to what has been just shown, and is less than or equal to the
z-integrable function p(x, z). Then, by a well known integration theorem, we
can go to the limit under the integral sign, so that

(7.46) $\lim_{j\to\infty} \int_{E_0} p^{(b_j+1)}(x, y)\,dy = 0, \quad x \in X.$

If both sides of (7.43) are integrated with respect to y over a set E in the field,
then if N → ∞, we obtain


lim inf TI (x, y)dy = lim inf p(x, ads f Tl (z, y)dy 
E E 


= lim inf p(x, adds f Tl (z, y)dy. 
CE, E 

Now the z-integrand $p^{(\nu)}(x, z)\int_E \Pi^{(N)}(z, y)\,dy$ converges, as N → ∞, to
$p^{(\nu)}(x, z)\,q(z, E)$ for almost all z in $CE_0$, according to (7.44). Moreover this
integrand is less than or equal to the z-integrable function $p^{(\nu)}(x, z)$. Then we
can go to the limit under the integral sign, and obtain


liminf | 11% (x, y)\dy = f p(x, 2)q(z, E)dz, vy = 1, 2,- 
E 


Noo 


* Convergence in measure was defined and discussed by F. Riesz, Comptes Rendus de l’Académie 
des Sciences, Paris, vol. 148 (1909), pp. 1303-1305. 
On the other hand,


lim sup | (x, y)dy = lim sup p(x, z)dz (z, y)d 
E E E 
0 


N- ow 


+f 90%, ds f yay} 
E 


CEy 


< f p(x, 2)ds + f p(x, s)q(s, Eds, 
Ey 


1,2,-- 
Then 


(7.49) $\limsup_{N\to\infty} \int_E \Pi^{(N)}(x, y)\,dy - \liminf_{N\to\infty} \int_E \Pi^{(N)}(x, y)\,dy \le \int_{E_0} p^{(\nu)}(x, z)\,dz, \quad \nu = 1, 2, \cdots.$


If ν is allowed to increase without limit through the sequence $\{b_j + 1\}$ of
(7.46), it follows that the limit in (7.41) exists for all x, as was to be proved.

Now in addition to the other hypotheses, suppose that for some integer μ,
$p^{(\mu)}(x, y)$ is bounded in x for each y. The existence of the limit in (7.41) is
readily seen to imply that if f(x) is a bounded X-measurable function,

$\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} \int_X p^{(m)}(x, z)\,f(z)\,dz$

exists for all x. If we take $f(z) = p^{(\mu)}(z, y)$, then

$\lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} \int_X p^{(m)}(x, z)\,p^{(\mu)}(z, y)\,dz = \lim_{N\to\infty} \frac{1}{N} \sum_{m=\mu+1}^{N+\mu} p^{(m)}(x, y) = \lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} p^{(m)}(x, y)$

exists for all x, y, as was to be proved.
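
The difference between convergence of the averages in (7.42) and convergence of the iterates themselves can be seen in a finite-state analogue (case II rather than the density case treated here). The following is only an illustrative sketch: the two-state matrix `flip` and the helper names are assumptions made for the example, not objects from the text. The powers of `flip` alternate forever, yet their averages settle down.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(p, n_terms):
    """Return (1/N) * sum_{m=1}^{N} p^m for a square stochastic matrix p."""
    n = len(p)
    power = [row[:] for row in p]  # starts at p^1
    total = [[Fraction(0)] * n for _ in range(n)]
    for _ in range(n_terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        power = mat_mul(power, p)
    return [[entry / n_terms for entry in row] for row in total]


# Hypothetical example: a chain that deterministically swaps its two states.
flip = [[Fraction(0), Fraction(1)],
        [Fraction(1), Fraction(0)]]

# The iterates oscillate between `flip` and the identity, so they have no limit.
second_power = mat_mul(flip, flip)

# The averaged iterates do converge; with N even the average is exact.
avg = cesaro_average(flip, 1000)
print(avg)
```

In this toy chain the limit of the averages does not depend on the starting state, which is the finite counterpart of the situation described in Theorem 7.6 (ii).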

A given function p(x, y) may correspond to several temporally homogene- 
ous processes. If all these processes are metrically transitive, the function 
p(x, y) will be called metrically transitive. If none of these processes has angle 
variables, p(x, y) will be said to have no angle variables. Otherwise p(x, y) 
will be said to be not metrically transitive, or to have angle variables, as the 
case may be. 


THEOREM 7.6. The function p(x, y) is metrically transitive if and only if 
(i) there is only a single absolute probability density p(x); or 
(ii) q(x, E) (or q(x, y) if the hypotheses of the second part of Theorem 7.5
(ii) are satisfied) is independent of x; or
(iii) the integral equation

(7.50) $\int_X \psi(x)\,p(x, y)\,dx = \psi(y),$

in the X-measurable integrable function ψ(x), has only a single linearly independent
solution; or
(iv) there are no disjunct X-sets $F_1$, $F_2$ in $F_x$, of positive X-measure, such that

(7.51) $p(x, y) = 0, \quad x \in F_j, \; y \notin F_j, \quad (j = 1, 2),$

if we neglect (x, y)-sets of (x, y)-measure 0.*
The various parts of this theorem are proved by exactly the methods of
the proof of Theorem 7.2. In the case of metric transitivity, the limit q(x, E)
can be expressed simply by

(7.52) $q(x, E) = \int_E p(y)\,dy.$

(This equation corresponds to the equation $q_{jk} = p_k$ for all j, k in case II, when
the given matrix is metrically transitive.) As an example of the proofs used,
we prove (iii).

Proof of (iii). If the function p(x, y) is metrically transitive, let p(x) be the
uniquely determined probability density. Then (7.52) is true. If ψ(x) is X-measurable
and integrable, and satisfies (7.50), it follows that

$\int_X \psi(x)\,dx \int_E \Pi^{(N)}(x, y)\,dy = \int_E \psi(y)\,dy,$

and if N → ∞ this becomes†

(7.53) $\int_X \psi(x)\,dx \int_E p(y)\,dy = \int_E \psi(y)\,dy.$

If a is defined as $\int_X \psi(x)\,dx$, (7.53) implies, since E is arbitrary, that ψ(y) = a p(y)
for almost all y. Since the functions p(y) and ψ(y) satisfy their integral equation
identically, ψ(y) = a p(y), as was to be proved. Conversely, if there is a
solution of (7.50), uniquely determined (up to a constant factor), then the
absolute probability density, which is a solution, is uniquely determined;
hence there is metric transitivity, according to part (i).
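
Criterion (iii) has a direct finite-state counterpart: the row-vector equation ψP = ψ replaces the integral equation (7.50), and one counts its linearly independent solutions. The sketch below is a hedged illustration only; the two matrices and the function names are hypothetical, and exact rational elimination is used in place of any particular library routine.

```python
from fractions import Fraction


def nullity(matrix):
    """Dimension of the null space, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return n_cols - rank


def stationary_dimension(p):
    """Number of linearly independent row vectors psi with psi P = psi."""
    n = len(p)
    # psi P = psi is equivalent to (P^T - I) psi^T = 0.
    system = [[Fraction(p[j][i]) - (1 if i == j else 0) for j in range(n)]
              for i in range(n)]
    return nullity(system)


F = Fraction
# Hypothetical chain with a single class: one independent solution.
transitive = [[F(1, 2), F(1, 2)],
              [F(1, 4), F(3, 4)]]
# Hypothetical chain with two closed classes {0, 1} and {2, 3}, so that
# p(x, y) = 0 whenever x is in one class and y is outside it, as in (7.51).
split = [[F(1, 2), F(1, 2), F(0), F(0)],
         [F(1, 4), F(3, 4), F(0), F(0)],
         [F(0), F(0), F(1, 3), F(2, 3)],
         [F(0), F(0), F(1), F(0)]]

print(stationary_dimension(transitive), stationary_dimension(split))
```

Each closed class of the second matrix carries its own stationary vector, so the solution space has one dimension per class.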


* The equivalence of (i), (ii), (iii) was proved by Fréchet (I), in the case (described above) which
he considered. The equivalence of (ii) and (iv) was announced by Kryloff and Bogoliouboff (II),
whose hypotheses apparently exclude the possibility of exceptional values in (7.51).

† The x-integrand $\psi(x)\int_E \Pi^{(N)}(x, y)\,dy$ converges for all x to $\psi(x)\int_E p(y)\,dy$ and is uniformly less
than or equal to |ψ(x)|; so we can go to the limit under the x-integral sign.
There is some interest in developing this theorem further.* Suppose a
function p(x, y) is not metrically transitive. Then X-sets $F_1$, $F_2$ exist as described
in (iv). Now it may be that $F_1$ itself contains X-measurable sets
$F_{11}$, $F_{12}$, of positive measure, such that p(x, y) = 0 if $x \in F_{1j}$, $y \notin F_{1j}$, if we neglect
(x, y)-sets of (x, y)-measure 0. In the contrary case the function p(x, y),
considered only for $x \in F_1$, $y \in F_1$, is metrically transitive. Now it is readily
seen that the uniform integrability condition prevents the existence of infinitely
many X-measurable sets $F_1, F_2, \cdots$, of positive X-measure, such that
if $x \in F_j$, $y \notin F_j$, then p(x, y) = 0 neglecting (x, y)-sets of (x, y)-measure 0. Hence
there is at most a finite number μ of such sets, and p(x, y) is metrically transitive
when considered defined only for x, y ∈ $F_j$. Let $p_j(x)$ be the corresponding
uniquely defined absolute probability density, and define $p_j(x) = 0$ if $x \notin F_j$.
Then if $\rho_1, \cdots, \rho_\mu$ are non-negative numbers with sum 1, $p(x) = \sum_j \rho_j p_j(x)$
is an absolute probability density for p(x, y), and conversely, any absolute
probability density for p(x, y) is such a linear combination. The limit q(x, E)
of Theorem 7.6 (ii) must be


THEOREM 7.7. The function p(x, y) has no angle variables if and only if
(i) whenever E ∈ $F_x$,

(7.54) $\lim_{m\to\infty} \int_E p^{(m)}(x, y)\,dy = q(x, E)$

exists for all x ∈ X (or, in case the hypotheses of the second part of Theorem 7.5
(ii) are satisfied, whenever $\lim_{m\to\infty} p^{(m)}(x, y)$ exists for all x, y); or
(ii) it is impossible to find disjunct X-sets $E_1, \cdots, E_\nu$, ν > 1, of positive
X-measure, such that (if we neglect an (x, y)-set of (x, y)-measure 0),

(7.55) $p(x, y) = 0, \quad x \in E_r, \; y \notin E_{r+1}, \; r = 1, \cdots, \nu - 1; \qquad x \in E_\nu, \; y \notin E_1.$


In case II, there seems to be no essential difference between the existence 
of angle variables and the existence of solutions (not equal to 1, of modulus 1) 
of the characteristic equation of the given matrix. However, in the present 
case it seems possible to obtain more general results by considering angle 
variables rather than solutions of the integral equation 


$\int_X \psi(y)\,p(x, y)\,dy = c\,\psi(x).$


The greater adaptability of angle variables is shown, for example, by the fact 


* Cf. Kryloff and Bogoliouboff (I, II), Doeblin and Fortet (I).
that the existence of an angle variable implies the existence of a bounded
angle variable (as we have seen above). Fréchet was able to extend the usual 
Fredholm theory of integral equations to his kernels p(x, y), and so could ob- 
tain the complete analogue of Theorem 7.3; and in the present treatment also, 
if the Fredholm theory is available, the proof of Theorem 7.3 goes right 
through in case III.

Proof of (i). Suppose that p(x, y) has no angle variables. Let p(x) be an
absolute probability density corresponding to p(x, y). We show first that the
hypotheses of Theorem 6.5 are satisfied, so that there is a function φ(x, y)
such that for every set E ∈ $F_x$ and for all x (except perhaps values in a set on
which the integral of p(x) vanishes),

(7.56) $\int_E p(x, y)\,dy = \int_E \phi(x, y)\,p(y)\,dy.$

Let $E_0$ be the x-set on which p(x) = 0. Since

(7.57) $\int_X p(x)\,dx \int_{E_0} p(x, y)\,dy = \int_{E_0} p(y)\,dy = 0,$

$\int_{E_0} p(x, y)\,dy = 0$, except possibly on an x-set on which the integral of p(x)
vanishes. Then if φ(x, y) is defined by

$\phi(x, y) = p(x, y)/p(y), \quad p(y) > 0,$
$\phi(x, y) = 0, \quad p(y) = 0,$

it is readily verified that (7.56) holds, except possibly for values of x for which
the integral of p(x) vanishes. The hypotheses of Theorem 6.5 are therefore
satisfied, and as the process has, by hypothesis, no angle variables, whatever
absolute probability density is chosen, the limit in (7.54) must exist almost
everywhere on $CE_0$. A suitable generalization of the proof of the corresponding
part of Theorem 7.3 then completes the proof. Conversely, if the limit in
(7.54) exists for all x, Theorem 6.5 states that the function p(x, y) has no
angle variables. The transition from (7.54) to the unintegrated form
($\lim_{m\to\infty} p^{(m)}(x, y)$) is easily made as in the proof of Theorem 7.5.

Proof of (ii). If there are sets $E_1, \cdots, E_\nu$ as described in (ii), an angle
variable can easily be explicitly defined, or the proof of the corresponding
part of Theorem 7.3 can be generalized to show that p(x, y) must have angle
variables. Conversely suppose there is an angle variable, so that (cf. Theorem
6.3) there is an X-measurable function ψ(x) such that

(7.58) $\psi(x_1) = c\,\psi(x_0), \quad |c| = 1, \; c \ne 1,$

if we neglect an $(x_0, x_1)$-set of P-measure 0. Then

(7.59)
so that

(7.60) $p(x)\,p(x, y)\,[\psi(y) - c\,\psi(x)] = 0$

for almost all (x, y), and

(7.61) $p(x, y)\,[\psi(y) - c\,\psi(x)] = 0, \quad x \in CE_0,$

if we neglect (x, y)-sets of zero measure. Let ξ be a point not in $E_0$, such that

$\int_{E_0} p(\xi, y)\,dy = 0, \qquad p(\xi, y)\,[\psi(y) - c\,\psi(\xi)] = 0$

(where the second is to hold for almost all values of y). This may exclude
(cf. (7.5)) a ξ-set of measure 0, besides $E_0$. Then, since p(ξ, y) > 0 on $CE_0$ on a
set of positive y-measure (its integral over $CE_0$ is 1), $\psi(x_1)$ takes on the value
cψ(ξ) on a set of positive P-measure. Now let a be any value, not equal to 0,
assumed by ψ on a set of positive P-measure. According to Theorem 6.3 (iii),
c must be a root of unity. Let ν be the smallest exponent r for which $c^r = 1$.
The function ψ(x) takes on values $a, ca, \cdots, c^{\nu-1}a$ on subsets $E_1, \cdots, E_\nu$ of X.
The fact that if $x_0 \in E_r$, then $x_1 \in E_{r+1}$ (r = 1, ···, ν − 1), (if $x_0 \in E_\nu$, then
$x_1 \in E_1$) necessarily, if we neglect sets of P-measure 0, and that the set determined
by the condition $x_0 \in E_1 + \cdots + E_\nu$ is invariant up to a set of P-measure 0,
implies the conditions of (7.55).
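
The cyclic structure in (7.55) and the function ψ of this proof can be made concrete in a finite-state analogue. The sketch below is an assumption-laden illustration: the six states, the three classes, and the transition weights are hypothetical choices, with sums over states standing in for the integrals of the text. It builds a kernel that moves each class into the next one and checks that ψ, equal to a power of a cube root of unity on each class, satisfies the finite form of the relation ψ(x_1) = cψ(x_0).

```python
import cmath

NU = 3                                  # number of cyclic classes (hypothetical)
c = cmath.exp(2j * cmath.pi / NU)       # a primitive NU-th root of unity
classes = [[0, 1], [2, 3], [4, 5]]      # finite stand-ins for E_1, E_2, E_3


def build_kernel():
    """Transition matrix sending each class into the next one cyclically."""
    p = [[0.0] * 6 for _ in range(6)]
    weights = [0.3, 0.7]                # arbitrary positive weights with sum 1
    for r, members in enumerate(classes):
        nxt = classes[(r + 1) % NU]
        for x in members:
            for y, w in zip(nxt, weights):
                p[x][y] = w
    return p


p = build_kernel()

# psi takes the value c**r on the r-th class (r = 0, 1, 2); the list order
# matches the state numbering because the classes are listed in order.
psi = [c ** r for r, members in enumerate(classes) for _ in members]

# Finite form of the eigen-relation: sum_y p(x, y) psi(y) = c psi(x).
residuals = [abs(sum(p[x][y] * psi[y] for y in range(6)) - c * psi[x])
             for x in range(6)]

# Smallest exponent r with c**r = 1, which recovers the number of classes.
order = next(r for r in range(1, 10) if abs(c ** r - 1) < 1e-9)
print(max(residuals), order)
```

Because the chain in this example moves deterministically from class to class, the iterates of `p` cannot converge, which matches the role angle variables play in Theorem 7.7 (i).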

The set $E_1 + \cdots + E_\nu$ is one of the sets F (corresponding to invariant
Ω-sets) analyzed above. In general there will be then a finite number of F-sets,
and if the function p(x, y) has any angle variables, one or more of these
F-sets will be divided into a finite number of E-sets.*

Combining the previous theorems we obtain finally the theorem: 


THEOREM 7.8. The function p(x, y) is metrically transitive and has no angle 
variables if and only if 
(i) whenever E ∈ $F_x$, $\lim_{m\to\infty} \int_E p^{(m)}(x, y)\,dy$ exists for all x and is independent
of x (or, in case the conditions of the second part of Theorem 7.5 (ii) are satisfied,
if $\lim_{m\to\infty} p^{(m)}(x, y)$ exists for all x, y and is independent of x); or
(ii) it is impossible to find sets as described in Theorems 7.6 (iv) or 7.7 (ii);
or
(iii) every function p(x, y), $p^{(2)}(x, y)$, ··· is metrically transitive.


* This decomposition of X was announced by Doeblin and Fortet (I). The hypothesis, made here, 
that absolute and conditional probabilities are given by density functions, is unnecessary, as we only 
need enough hypotheses to assure the fact that an angle variable will assume one of its values on a 
set of positive P-measure. 
IV. The case requires little comment. The properties of the transformation
S correspond to similar properties of T; for example, if one has angle
variables, so has the other.
CORRECTION TO THE PAPER
“THE MULTINOMIAL SOLID AND THE CHI TEST”*


BY 
BURTON H. CAMP 


The following note should be appended at the bottom of page 141:†

However, Pearson’s sum (loc. cit.) extends over the interior of an ellipsoid 
in (m—1)-space, instead of over the interior of our parallelopiped. The two 
figures overlap, but do not coincide; hence it should not be asserted, as in §5, 
that the estimated error surely applies to the Pearson tables. An attempt to 
extend the theory further has encountered so many difficulties that the author 
would be compelled at present to adopt the error found in §5 as his best esti- 
mate of the error inherent also in Pearson’s approximation. But it might not 
always be a close estimate. The equations (20) and the restrictions (7) do 
apply to both cases. If an expression were to be found for the sum Q, when 
extended over the interior of our parallelopiped, one would have a test of 
significance which would rival Pearson’s chi-square test, and which, unlike 
his, would reduce to the ordinary procedure when m=2 (point binomial). It 
would give the probability that a sample $(f_1, \cdots, f_m)$ had class frequencies
as near the ideal as those observed, whereas Pearson’s test gives the proba- 
bility of a sample’s having class frequencies whose total probability would be 
as great as the total probability of the sample observed. These two possible 
tests are not identical. The form just suggested is the natural extension of the 
method usually preferred for the point binomial, and the estimated error 
could be made to apply to it with exactness; but this estimate is still useful 
as an approximation in Pearson’s case, and the restrictions (7) continue to be 
needed. 


* Received by the editors December 20, 1937. 
† These Transactions, vol. 31 (1929), pp. 133-144.
GEOMETRY IN AN n-DIMENSIONAL SPACE
WITH THE ARC LENGTH


$s = \int \{A_i(x, x')\,x''^i + B(x, x')\}^{1/p}\,dt$


BY 
AKITSUGU KAWAGUCHI 


Geometry in the manifold $K_n^{(m)}$ of line elements of a higher order, that is,
$(x, dx, \cdots, d^m x)$ or $(x, x', \cdots, x^{(m)})$, in a space of n dimensions with a point
coordinate system $x^i$ (i = 1, 2, ···, n), was first furnished with its foundation
by the present author.† In close connection with this theory, there are two
important problems. The first is the geometry of paths in this space, that is,
the geometrical study of the system of paths $x^i = x^i(t)$ which are defined by
the ordinary differential equations of the form


+ T(t, x, = 0, 


where $x^{(m)i}$ denotes $d^m x^i/dt^m$ and t is a parameter. The second is a case of metric
geometry in this space, that is, the study of the geometrical properties characterized
by a metric which gives as the arc length s of a curve $x^i = x^i(t)$

$s = \int F(t, x, x', \cdots, x^{(m)})\,dt.$

The first problem is a generalization of the so-called geometry of paths in
a space with a generalized affine connection‡ $x''^i + \Gamma^i(t, x, x') = 0$, and was


* Presented to the Society, December 28, 1937; received by the editors February 8, 1937, and, 
in revised form, August 3, 1937. 

A summary of the present paper has already appeared in Proceedings of the Imperial Academy, 
vol. 12 (1936), pp. 205-208. 

† A. Kawaguchi, Die Differentialgeometrie in der verallgemeinerten Mannigfaltigkeit, Rendiconti
del Circolo Matematico di Palermo, vol. 56 (1932), pp. 245-276; Theory of connections in the general- 
ised Finsler manifold, Proceedings of the Imperial Academy, vol. 7 (1931), pp. 211-214; ibid., vol. 8 
(1932), pp. 340-343; ibid., vol. 9 (1933), pp. 347-350. 

‡ L. Berwald, Untersuchung über die Krümmung allgemeiner metrischer Räume auf Grund des in
ihnen herrschenden Parallelismus, Mathematische Zeitschrift, vol. 25 (1926), pp. 40-73; J. Douglas,
The general geometry of paths, Annals of Mathematics, (2), vol. 29 (1927), pp. 143-168; D. D. Kosambi,
Parallelism and path-space, Mathematische Zeitschrift, vol. 37 (1933), pp. 608-618, and E. Cartan,
Observations sur le mémoire précédent, ibid., pp. 619-622; D. D. Kosambi, Systems of differential
equations of the second order, Quarterly Journal of Mathematics (Oxford), vol. 6 (1935), pp. 1-12;
W. Ślebodziński, Sur deux connections affines généralisées, Prace Matematyczno-Fizyczne, vol. 43
(1935), pp. 1-39.
first propounded by D. D. Kosambi.* Although he derived many interesting
results, his theory seems to be systematically not yet complete, and there are
many irregularities. The present writer and H. Hombu have reconstructed
this theory from another and more general standpoint, by which the irregularities
of the theory of Kosambi exhibit themselves.†

On the other hand, the second problem is a generalization of a Finsler
space and there have been many attempts to solve it,‡ but it remains yet
unsolved. Recently E. Cartan§ has introduced a connection, which is related
intrinsically with an integral $\int F(x, y, y', y'')\,dx$, in a plane under the group
of all contact transformations by use of several invariant Pfaffians and
has discussed in detail a special case where the integral has the form
$\int (Ay'' + B)\,dx$. This theory of Cartan has been extended by H. Hombu∥
to the case where the integral possesses the form $\int F(x, y, y', y'', y''')\,dx$.
These theories give a foundation to the invariant theory of the integral under
contact transformations and take the first step towards the general problem.
But unfortunately, by reason of their restricted method, which is available
only for a plane, these theories cannot be extended to an n-dimensional space
as they stand. Furthermore it is more desirable to set up the theory under the
group of all point transformations rather than contact transformations.

In the present work it is proposed to establish the foundation of the geometry
in an n-dimensional space with arc length given by the integral
$\int \{A_i(x, x')\,x''^i + B(x, x')\}^{1/p}\,dt$, by introducing several connections under the
group of all point transformations. Although the integral has a special form,
this would be the second step towards the general problem; and in this special 
case many concrete results are obtainable, which may be of interest. The 
integrand does not contain the parameter ¢ explicitly, and I shall investigate 
mainly only those properties that are invariant under any (analytic) trans- 
formation of the parameter ¢, for the reason that only these properties have 
geometrical meaning, that is, are related intrinsically to the base curves. This 


* D. D. Kosambi, Path-spaces of higher order, Quarterly Journal of Mathematics (Oxford), vol. 7 
(1936), pp. 97-104. 

† A. Kawaguchi and H. Hombu, Die Geometrie des Systems der partiellen Differentialgleichungen
höherer Ordnung, Journal of the Faculty of Science, Hokkaido Imperial University, vol. 6 (1937),
pp. 21-02.

‡ H. V. Craig, On a generalized tangent vector, American Journal of Mathematics, vol. 57 (1935),
pp. 457-462; J. L. Synge, Some intrinsic and derived vectors in a Kawaguchi space, ibid., pp. 679-691;
A. Kawaguchi, Some intrinsic derivations in a generalized space, Proceedings of the Imperial Academy,
vol. 12 (1936), pp. 149-151; Certain identities in a generalized space, ibid., pp. 152-155.

§ E. Cartan, La géométrie de l’intégrale ∫F(x, y, y', y'')dx, Journal de Mathématiques, (9), vol. 15
(1936), pp. 42-69.

∥ H. Hombu, Invariantentheorie des Integrals ∫F(x, y, y', y'', y''')dx, Proceedings of the Imperial
Academy, vol. 12 (1936), pp. 156-158; Geometrie des Integrals ∫(Ly''' + M)dx, ibid., pp. 159-161.
restriction makes the theory somewhat complicated. The theory can be generalized
without difficulty to the case where the integral is of the form


f {Ai(x, x’, a!) B(x, x’, x!) | 
or 
f {ai(x, x')aj(x, + 2b(x, x’ + c(x, x’) } 


Chapter 1 is devoted to discussion of the general case, where p ≠ 3/2. In
this case there exists a system of paths having the equations of the form
$x''^i + \Gamma^i(x, x') = 0$, which leads to the definition not only of the base connection
but also of a connection C, by recalling the geometry of paths in the
manifold $K_n^{(1)}$. There is another connection C′ which is stated in Chapter 2
and which has a form similar to that of Cartan in a Finsler space.‡


CHAPTER 1. BASE CONNECTION AND CONNECTION C 


1. Let us consider an n-dimensional metric space such that the arc length
of a curve $x^i = x^i(t)$ is given by the integral

(1.1) $s = \int \{A_i(x, x')\,x''^i + B(x, x')\}^{1/p}\,dt,$

where $x'^i = dx^i/dt$, $x''^i = d^2x^i/dt^2$, and $A_i$ and B are differentiable functions
of $x^i$ and $x'^i$. In order that the arc length should be related intrinsically to
the curve, that is, that it should remain unaltered by a transformation of the
parameter t, we must have

(1.2) $A_i x'^i = 0,$
(1.3) $2A_i x''^i + (A_{i(k)} x''^i + B_{(k)})\,x'^k = p\,(A_i x''^i + B),$

where

$A_{i(k)} = \frac{\partial A_i}{\partial x'^k}, \qquad B_{(i)} = \frac{\partial B}{\partial x'^i}.$

We shall denote partial differentiation by $x'^i$ with (i) and that by $x^i$ with i
throughout this paper. Equation (1.3) implies


* T. Ohkubo, Base connections in a special Kawaguchi space, Journal of the Faculty of Science, 
Hokkaido Imperial University, vol. 5 (1936), pp. 167-188. 

† H. Hokari, Die Geometrie des Integrals ∫(a_i a_j x''^i x''^j + 2b a_i x''^i + c)^{1/p}dt, Proceedings of the Imperial
Academy, vol. 12 (1936), pp. 209-212.

‡ E. Cartan, Les espaces de Finsler, Actualités Scientifiques et Industrielles, vol. 79, 1934.
(1.4) $A_{i(k)} x'^k = (p - 2)A_i, \qquad B_{(k)} x'^k = pB.$


Thus, the $A_i$ are homogeneous of degree p − 2 with regard to the $x'^j$, and B
is homogeneous of degree p. From (1.2) it follows, on differentiating by $x'^j$,
that


(1.5) $A_{i(j)} x'^i = -A_j.$
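
The role of conditions (1.2) and (1.4) can be checked numerically on a toy integrand. The sketch below is an illustration under stated assumptions, not the author's construction: it takes n = 2 and p = 4, and the particular functions `A` and `B` are hypothetical choices made only so that A_i x'^i = 0, A_i is homogeneous of degree p − 2, and B is homogeneous of degree p. Under a change of parameter with σ = dt/dt̄ > 0, the velocity becomes σx' and the second derivative becomes σ²x'' + (dσ/dt̄)x', and F = A_i x''^i + B should pick up the factor σ^p, so that F^{1/p} dt is unchanged.

```python
import math

P = 4  # hypothetical value of the exponent p


def A(v):
    """Covector A_i(x'): degree p - 2 = 2 in v, with A(v) . v = 0."""
    norm = math.hypot(v[0], v[1])
    return (-norm * v[1], norm * v[0])


def B(v):
    """Scalar B(x'): homogeneous of degree p in v."""
    return (v[0] ** 2 + v[1] ** 2) ** (P / 2)


def F(v, a):
    """F = A_i x''^i + B for velocity v = x' and acceleration a = x''."""
    Av = A(v)
    return Av[0] * a[0] + Av[1] * a[1] + B(v)


def reparametrize(v, a, sigma, sigma_dot):
    """Velocity and acceleration in the new parameter, sigma = dt/dt_bar."""
    new_v = (sigma * v[0], sigma * v[1])
    new_a = (sigma ** 2 * a[0] + sigma_dot * v[0],
             sigma ** 2 * a[1] + sigma_dot * v[1])
    return new_v, new_a


def lhs_of_1_5(v, j, h=1e-6):
    """Central-difference estimate of sum_i (dA_i/dx'^j) x'^i."""
    total = 0.0
    for i in range(2):
        plus, minus = list(v), list(v)
        plus[j] += h
        minus[j] -= h
        total += (A(plus)[i] - A(minus)[i]) / (2 * h) * v[i]
    return total


v, a = (1.5, -0.5), (0.2, 2.0)
sigma, sigma_dot = 1.7, -0.9
new_v, new_a = reparametrize(v, a, sigma, sigma_dot)

print(F(new_v, new_a), sigma ** P * F(v, a))   # the two values should agree
print([lhs_of_1_5(v, j) + A(v)[j] for j in range(2)])  # both close to zero
```

The term in `sigma_dot` drops out only because A(v) is orthogonal to v, which is the content of (1.2); the remaining factor σ^p comes from the two homogeneity degrees in (1.4).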


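2. The point transformation $y^i = y^i(x^j)$ gives rise to the following relations:

(1.6) $y'^i = P^i_j x'^j, \qquad y''^i = P^i_j x''^j + P^i_{jk} x'^j x'^k,$

where

$P^i_j = \frac{\partial y^i}{\partial x^j}, \qquad P^i_{jk} = \frac{\partial^2 y^i}{\partial x^j \partial x^k}.$

Since the arc length (1.1) is a scalar, it can be seen at once that $A_i$ is a vector.
But B is not a scalar and becomes, by the point transformation,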


(1.7) B(y, y’) = B(x, x’) — A,(x, nx! x". 
Besides A ; we get two other intrinsic vectors 
(1.8) Ti = — — + Buy, 
E; = (Aicey — Agcy) + (Aiayay — 
(1.9) + — Aggie” + Ans + Aik — + 
+ Ag — + Bj. 
The vector 7; is the Craig vector* 


d OF oF 


dt = ax’ 


d? OF d OF OF 
dt? = =dt Oxi 


of the function F = A ,x’’'+B. It can be verified easily that 


d 
(1.10) Tex’ = + B), = (1 — p) + B). 


* See H. V. Craig, On a generalized tangent vector, loc. cit. 


and £; is the Euler vector 
When s stands for t, we get the normalized vectors $T_i^*$ and $E_i^*$ for $T_i$ and $E_i$.
Then we have 


(1.10’) Tixi=p, E*xi=0, 


where the dot denotes differentiation by s. Among these vectors there are the 
relations 
rs Q2 34 d*t 


d*t 
P "\ds ds? 


( DA 3) + dt 
4 ds dst $\ds) 


3. If 2p=3, the determinant of the tensor 
(1.12) Git = — 


(1.11) EF 


& 

S| 


vanishes, identically, because 
= 0. 


Now we assume that p ≠ 3/2, and that the determinant does not vanish identically;
then we have from (1.8)


(1.13) = — = x/" + 
where 
(1 .14) = (2A Byy)G" and G; i= 


The functions $\Gamma^i$ are homogeneous of degree two with regard to the $x'^j$. As
the relation $G_{ik} x'^k = (2p - 3)A_i$ holds, the last equation of (1.14) shows that


1 
(1.15) AG' = 
From G;,;G*' =6;' we get also 
(1.15’) AG# = — — 
p 
which leads to 
(1.16) F = 


From (1.14) we obtain further 


(1.17) 2r'A, = B.


Now we shall introduce, as in a Finsler space, the manifold $K_n^{(1)}$ of line
elements (x, x') in the space; and for this manifold (1.13) gives the base connection
in our space.*

4. Let $v^i$ be a contravariant vector field in $K_n^{(1)}$, homogeneous of degree
zero with regard to the $x'^j$, that is, independent of a choice of the parameter t.
Then a new vector can be derived from $v^i$ in connection with the base
connection:

dv' 


1.18 
( ) dt dt + (i) T 


which is an absolute derivative of the vector v‘ along the curve and is inde- 
pendent of transformation of ¢. From (1.18) we can define further the abso- 
lute differential corresponding to a displacement from a line element (x, x’) 
to a neighboring line element (x+dx, x’+dx’) 


(1.19) bu = dv’ + 


This absolute differential remains unaltered by any change of ¢. By any trans- 
formation of coordinates I'j,,, are transformed in the same way as the pa- 
rameters of an affine connection and are homogeneous of degree zero with 
regard to the $x'^k$, that is, they are invariant under transformation of t. For
this reason the connection C, defined by (1.19), characterizes the geometrical
properties of our space. It is not necessary, therefore, to take the line elements
(x, x') along any curve; they can be taken arbitrarily at any point.

On account of the base connection (1.13) we have for a displacement of a 
line element 


(1.20) = dx’t + 


On account of this, we can write (1.19) in the form 


Oxi 
= dxiV + 
where 


* For the base connection reference may be made to A. Kawaguchi, Die Differentialgeometrie in 
der verallgemeinerten Mannigfaltigkeit, loc. cit. 

t As x@li isa vector, we know from (1.13) that I’! behave under a coordinate transformation in 
the same way as affine connection parameters 7;; multiplied by x’ix’*. From this fact it can be con- 
cluded that 5v‘/dt is a contravariant vector. 
dvi


Ox'i 


k i k P i 


which are defined as the covariant derivatives of a vector v‘. V/v‘ are multi- 
plied by a factor dt/di for a change of the parameter /=/(¢). For a covariant 
vector or a tensor we can define covariant derivatives in the same manner as 
in usual tensor calculus, for example, 


(1.23) 60; = dv; — 


In particular we obtain for A; 


(1.24) Wg @ AG 4s = Aa; 
for which the following relations hold: 

6A; 

dt 


These can be verified without difficulty from (1.4), (1.5), and (1.14). 
The covariant differential of a relative tensor v‘ of weight & is also given 
as follows: 


(1.25) «IV ;A; = 0, = 0, 


= 


(1.26) = dy + da. kv 


Viv @ tla — 
api 
Vivi = 
Ox") 


5. If the components of a vector $v^i$ are homogeneous of degree h with regard
to the $x'^j$, then they are components of a geometrical object in some
sense. But the $\delta v^i$ in the sense of (1.19) have no geometrical meaning. That
is, the $\delta v^i$ are not independent of the parameter t but defined when and only
when a choice of a parameter t is held fixed. In fact, by transformation of
the parameter t one obtains


dt\* fdt\'" dt 
= (5) + d—, 
dt dt dt 


which carries no geometrical meaning. Although for such a vector the absolute
differential having geometrical meaning cannot be defined, it is possible
to define it when the values of $x''^i$ in the considered line element (x, x') are
given by any reason,† as we shall now show. Nevertheless it must be noticed
that the covariant derivatives (1.22) of such a vector are surely homogeneous
with regard to the $x'^j$.

Since x’’‘ are known in the considered line element (x, x’), the values of F 
and xl‘ are also determined. For displacement from a line element (x, x’) to 
a neighboring line element (x+dx, x’+dx’) it is known that 6A ,«ll‘ and 
(5A ;/dt)6x’* are both scalars. They are varied by transformation of ¢ as fol- 
lows: 


do 
(6A = + (p — — ot Ade”, 


= — Ajox* — 'F de, 
i dt , dt 


where o = dt/di. Accordingly one gets 


(Asdxt + = + — 
where 
1 
(p — 1)(p — 3)F 
== 1 (p — 2)Anay + 
(p — 1)(p — 3)F 
since 


a pox"? = + “dt. 


1 
(p — 1)(p 
when #1, 3. Then one obtains as the required absolute differential 
(1.28) = + hvi(A;dxi + 
accompanied by the covariant derivatives 
h 


= + vig Ply Ax, 
- 1 — 3)F 
= Vivit+ pix + A; 
(p — 1)(p — 3)F \(p )A ice} 
since 
(1.30) = 0, — 2)Anciy + Aja} = 0. 


† For example, if one considers an absolute differential adjoint to a curve touching the line
element (x, x’), one can take x’’' along the curve as the associated one. 
F*/»§y‘ are independent of a change of the parameter ¢. For x’‘ equation
(1.29) yields 

dt (p — 3)F 


If $v^i$ be defined in the manifold $K_n^{(2)}$ of line elements of the second order
(x, x', x'') and go into $\sigma^h v^i$ by transformation


= 


do 


then it is very easy to derive $\delta v^i$, which goes into $\sigma^h \delta v^i$ for the same
transformations:


i i i j h ; 
(1.32) 60 = dv + log F. 


6. Displace a line element (x, x’) parallel to itself in the sense of the base 
connection on its direction. A curve is obtained, which satisfies the differential 
equations 


(1.33) = Q, 


that is, 
+ = 0. 
These differential equations define a system of paths in our space, but these
paths are not extremal curves of the integral $\int F^{1/p}\,dt$, unlike those in the
Finsler space. They are minimal curves, for along them F = 0 holds. But it is
to be noticed that minimal curves do not always satisfy (1.33).
The equations in (1.33) in any parameter t become


+ = px’i, 


where p is a function of t. We shall refer to a parameter t as affine length, if it
makes p equal zero. It is determined except for a linear transformation
$\bar{t} = at + b$. If we confine ourselves to affine length, the discussion in §6 is
superfluous.
Considering the covariant derivatives of a vector v‘ along any path, we ob- 
tain from (1.22) 
dv' 


as 6x’'=(. Particularly it is known from (1.25) that 


(1.35) VoA; = 0.


The paths are formally the same ones as those in the space with a general- 
ized affine connection. There is, however, a difference in that in our space 
there is an intrinsic covariant vector $A_i$ besides the functions $\Gamma^i$, and our
space is characterized by both quantities $A_i$ and $\Gamma^i$. We can develop the
theory of curvatures, treat the equivalence problem, and so forth, by slight 
modifications of the methods employed in the geometry of generalized paths. 

7. Computing the parenthesis of Poisson for the covariant derivatives, we 
find the curvature tensors as follows: 


(ViVi — = —Ryrv + 


(1.36) 
(ViVi = — Burr, 


where we put 


Br = 
OG h ‘ h 
Ru = + Pa Tawl wa 
Ox Ox? 
h i h a 
+ — 
¢. i 
OT (3) OT 
Kin = - 
Ox* Oxi 


(1.37) 


h h 


It is not difficult to find the relations satisfied by these curvature tensors. 
They are 
Ria + Rin =0, + Raj + Rix =, 
(1.38) Ky =0, 
Rix Kin = vi Ki. 
The identities of Bianchi are now expressed by 
VaR je + Kaj Beye = 0, 
(1.39) 20 + Vi = 0, 
= 0. 


The invariant theory under transformation of the parameter ¢ can also be 
developed by the fundamental descriptive invariants of Douglas* 


* J. Douglas, loc. cit. 
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7+ 1 n+1 n+ 1 

(1.40) Gia = 

n—1 n—1 


where R*;;‘ denotes Rj formed with the II’s in place of the I’’s. 
8. The conditions 


(1.41) Bur =0, =0 


are necessary and sufficient in order that it may be possible by transformation
of coordinates to reduce the differential equations of the paths in the space
to the form $x''^i = 0$, accordingly to reduce the curve length to


s= f (A pdt. 


If moreover A ij)(%) =0, that is, if the A; are linear homogeneous func- 
tions of 


= Aj, Aun = — p=3. 


The A, ;) are functions only of the x’s. Then we have the curve length 


=f 


For the two-dimensional affine space, A i; are all constants. 
The conditions 


(1.42) =0, Rie =0 
are necessary and sufficient in order that there may exist a coordinate system 
together with a parametrization such that the differential equations of the 
paths have the form x’’'=0. 

For the equivalence problem we have without difficulty the theorem: 


THEOREM. A necessary and sufficient condition for the equivalence of two 
spaces is that there exist an integer N (≥ 1) such that the first N sets of the following 
equations are compatible considered as equations for the determination of y‘ and 
P;' as functions of the independent variables x‘, and all their solutions 


yi = yi(x), Pi = 
satisfy the (N +1)st set of the equations: 
A; = A;, 
1. P. P,P, = Pe. 


=V,A,, = VAs, 
Bj = Baw Po, 
h_j_k 


Furthermore normal coordinates can be defined, and the replacement 
theorem also can be proved. 
CHAPTER 2. THE CONNECTION C’ 
9. We shall now proceed to define another connection in our space. In a 
previous paper* the author has proved the theorem: 


Let ®; be a covariant tensor of order m, that is, dependent upon x’, 
++, then n quantities 


pi" = (*) 
p 


for a fixed positive integer p (≤ m) are the components of a covariant vector, 
provided that v' is a contravariant vector. 


In consequence of this theorem one can deduce from a vector v‘ the follow- 
ing vectors: 


dvi 
2G; ; + + 


(2.1) — 


dv* 


(2.2) Di(E)v! 


2ayk 


(2.3) Di + UL + Ma) — 


where 


* A. Kawaguchi, Some intrinsic derivations in a generalized space, loc. cit. 
(2.4) = *, 


——— M 
2(p — 1) 
(2.5) Hix = — Lin = + 

(2.6) Mix = — Aggy! + Ang + Aik — + Ati 
(2.7) P; = A; jx" ix"! + B;. 


It can be verified easily that the following relations hold: 


— Di (T)x"i = (p — 1)T;, 


(2.8) — = (p 17s, 
Di; = (p — 1)Ei. 
10. Suppose now once more that p ≠ 3/2. Then we have from (2.1) an ab- 
solute derivative along a curve 


Since 
+ Tay) = + 2T*) + Mn; 


the absolute differential of a vector corresponding to a displacement from a 
line element (x, x’) to a neighboring line element (x+dx, x’+dx’) is obtained 
without difficulty. We have 


which reduces to (1.19), when 6x’/=0. 


The absolute differential (2.9) is, however, not invariant under transfor- 
mation of t, because 


(2.10) = (2p — 3)G"*Anay — 
does not vanish in general. When and only when 
(2.11) (p — 1)Aicey = Anew, 


(2.10) does vanish identically, and (2.9) is independent of transformation of t. 
11. If we adopt the notations 


lig = Anna + + (2 — 
Jus = + — 2)Ain; 
it will be easily inferred from (1.4) and (1.5) that 


(2.12) 


= 0, 
Hence the tensor 


lk lk 
(2.13) = (2 — DG Nie 
satisfies the relations 
(2.14) 0, = 0. 
We shall now put 
1 

p 3, 

(2.15) (p — 3)? 


= C.;;, p= 3. 


The tensor C%, is constructed from only A; and is independent of B. With 
the help of the tensor C’,, an absolute differential can be defined 
which is invariant under transformation of the parameter t. The connection 
characterized by A;, I’, and C‘, will be referred to as the connection C’. The 
torsion* of this connection is C‘,, and for v'=x’' (2.16) gives us the base con- 
nection (1.20). 

12. The covariant derivatives referred to the connection C’ possess the 


form 
Ovi 


i dvi dvi k i k ~,i 
(2.17) Viv = + Cry, 
and we have 

‘ 
(ViVi — = — + KuVir, 
(2.18) (ViVi = — Burov +C.nVo, 
(ViVE = — — 
where 
= Ry + Kir 
Bra = + Cal — — + 
(2.19) 


i h h i 
— Tay + 
i 


eee h i h 


are the curvature tensors. 


* See Cartan, Les espaces de Finsler, loc. cit., p. 32. 
Consider the differential forms 


dx, = dx’* + Mods’, 
then we have the equations of structure* of the connection C’: 
+ = — $Ruj [dx'de'] — Burj [dx'o'] 
The identities of Bianchi are now expressed by 


meee 


VpR + = 0, 
20 + Raje + = 0, 
(2.22) 
2V' mB + VePimg = 9, 
wP = 0. 
The equivalence theorem can be proved, but we shall forego the details 
here. 


* See Cartan, loc. cit., pp. 32-36. 
ON THE REDUCTION OF DYNAMICAL SYSTEMS BY 
MEANS OF PARAMETRIZED INVARIANT 
RELATIONS* 


BY 
E. R. van KAMPEN AND AUREL WINTNER 


Introduction. The classical reduction theory of canonical systems with n 
degrees of freedom assumes that there is known a function group of first 
integrals.† The reduction of the degree of freedom is then carried out by using 
the Hamilton-Jacobi theory or an equivalent approach. In the present paper 
a more general problem will be considered, since it will not be assumed that 
the known functions, or hypersurfaces in the phase-space, are represented by 
first integrals. In fact, the first integrals will be replaced by invariant 
relations,‡ so that, in particular, the known hypersurfaces need not form a 
continuous family, but may be isolated. 

It should be mentioned that the generalization of the reduction problem 
to the case where the given first integrals are replaced by invariant relations 
is not an artificial problem but one which arises quite naturally in the sim- 
plest applications. If, for instance, one wants to reduce the degree of freedom 
of the problem of three bodies by means of the classical first integrals in an 
explicit and symmetrical manner, one is compelled to replace these first in- 
tegrals by certain of their combinations which form a complicated system of 
invariant relations and not a system of first integrals.§ 

The treatment of the general case of invariant relations will differ from 
the usual treatment of the case of first integrals in that all considerations will 
be based on a parametrization of the system of invariant relations, this 
parametrization being symmetrical with respect to the n coordinates and n 
impulses. Needless to say, the usual treatment of first integrals, based on 
the Hamilton-Jacobi theory, cannot be applied in the general case under con- 
sideration. 

In the particular case where the system of invariant relations is a system 
of first integrals (§10), the treatment of the general case goes over into a 


* Presented to the Society, April 16, 1938, under the title, The canonical reduction of Hamilton 
systems; received by the editors August 2, 1937. 

† A presentation of the classical reduction theory is given, for instance, by Engel [1]. 

‡ As to this notion, which is due to Jacobi and is fundamental for the geometrical theories of 
Poincaré and Birkhoff, cf. Levi-Civita [4]. 

§ van Kampen and Wintner [3]. 
reduction theory which seems to have essential formal advantages over the 
usual treatments of this particular case. 

In the limiting case where the system of invariant relations is empty, the 
considerations go over into a treatment of canonical transformations which 
has been recently developed* and which suggested the approach of the pres- 
ent paper to the reduction problem. 

§11, which is independent of the rest of the paper, attempts an application 
of the method of parametrization to the case where the given relations are 
not invariant systems but constraints. However, the usual rule for the intro- 
duction of constraints will be verified to be identical with the rule suggested 
by the method of parametrization only in the case where the constraints are 
holonomic. 

Notation. A matrix with k rows and l columns will be called a (k, l)-matrix, 
so that the product CD of a (k, l)-matrix C and an (l, m)-matrix D is a 
(k, m)-matrix. If C is a (k, l)-matrix, C' will denote the (l, k)-matrix which is the 
transposed of C. A (k, 1)-matrix will be termed a k-vector, so that, if C and D 
are k-vectors, their scalar product may be represented as either of the two 
matrix products C'D, D'C. The (k, l)-matrix in which all kl elements are 0 
will be denoted by $0^k_l$; and $0^k_1$ by $0^k$. The letters h, r, ... will denote scalar 
functions. Total differentiation with respect to the time t will be marked by 
a dot, while t as a subscript will refer to partial differentiation. If, for 
instance, c is a scalar function c(Z, t) of a k-vector Z and of t, and if Z is a 
function of t, then 

$\dot{c} = c_t + \dot{Z}' \operatorname{grad}_Z c,$ 

where the subscript of grad refers to the space in which one carries out the 
partial differentiations. The functions in question will be assumed to possess, 
in the regions under consideration, continuous derivatives of the μth order, 
where either μ = 1 or μ = 2 or μ = 3; it will be always clear from the connection 
which of these three values is the actual value of μ. 

1. Parametrizations of invariant systems. For a fixed integer n ≥ 1, let G 
denote the (2n, 2n)-matrix 

(1) $G = \begin{pmatrix} 0^n_n & -E_n \\ E_n & 0^n_n \end{pmatrix},$ 

where $E_n$ denotes the n-rowed unit matrix, so that 
(2) $G' = -G = G^{-1}.$ 


* Wintner [7], van Kampen and Wintner [2]. 
† As to the holonomic case, cf. Levi-Civita [5]. 
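The canonical system with n degrees of freedom and with the Hamiltonian 
function 

$h = h(q_1, \dots, q_n, p_1, \dots, p_n, t),$ 

that is, the system 

$\dot{q}_\lambda = \partial h / \partial p_\lambda, \quad \dot{p}_\lambda = -\partial h / \partial q_\lambda, \quad \lambda = 1, \dots, n,$ 

may clearly be written in the form 

(3) $G\dot{X} = \operatorname{grad}_X h,$ 
where X denotes the 2n-vector 
(4) $X = (x_1, \dots, x_{2n})$ 
defined by 
(4 bis) $x_\lambda = q_\lambda, \quad x_{n+\lambda} = p_\lambda, \quad \lambda = 1, \dots, n,$ 
and 
(5) $h = h(X, t).$ 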
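
As a concrete check of the matrix form (3), the sketch below takes one degree of freedom with the illustrative Hamiltonian $h = (q^2 + p^2)/2$ (a harmonic oscillator, chosen here only as an example and not taken from the paper) and verifies that a known solution satisfies $G\dot{X} = \operatorname{grad}_X h$. It assumes X = (q, p) and G = [[0, -1], [1, 0]]; all function names are hypothetical.

```python
import math

G = [[0.0, -1.0], [1.0, 0.0]]  # n = 1, assumed block form of G

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def grad_h(X):
    """Gradient of h = (q^2 + p^2) / 2 with respect to X = (q, p)."""
    q, p = X
    return [q, p]

def X_of_t(t):
    """A solution of Hamilton's equations for this h: q = cos t, p = -sin t."""
    return [math.cos(t), -math.sin(t)]

def X_dot(t):
    """Time derivative of X_of_t."""
    return [-math.sin(t), -math.cos(t)]

def residual(t):
    """Largest component of G*Xdot - grad_X h along the solution."""
    lhs = matvec(G, X_dot(t))
    rhs = grad_h(X_of_t(t))
    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

The first component of $G\dot{X}$ is $-\dot{p}$ and the second is $\dot{q}$, so the vanishing of `residual` is exactly the pair of equations $\dot{q} = \partial h/\partial p$, $\dot{p} = -\partial h/\partial q$.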


For a fixed m, which is independent of (X, t) and is such that 
(6) $m \le 2n,$ 

let K be a (2n - m)-vector defined as a function K(X, t), in the 
(2n+1)-dimensional (X, t)-region under consideration, in such a way that 

(7) 2n - m = rank of $\operatorname{grad}_X K,$ 
and that the condition 
(8) $K = K(X, t) = 0^{2n-m}$ 

determines a non-empty set A(t) in the 2n-dimensional X-space for every 
fixed t. Since $\operatorname{grad}_X K$ is a (2n - m, 2n)-matrix, the assumption (7) is to the 
effect that A(t) is a locally m-dimensional manifold for every fixed t. Hence 
one can choose a 2n-vector 

(9) $F = F(Y, t),$ 

depending on t and on an m-vector 

(10) $Y = (y_1, \dots, y_m)$ 

in such a way that 

(11) $X = F(Y, t)$ 

is a local parametrization of the manifold A(t), and that 
(12) m = rank of J, 

where J denotes the Jacobian (2n, m)-matrix 

(13) $J = J(Y, t) = \operatorname{grad}_Y F.$ 

If Y = Y(t) is an arbitrary curve in the space of the m parameters (10), the 
corresponding curve 

(14) $X(t) = F(Y(t), t)$ 
in the space of the 2n phase coordinates (4) is such that the relation 
(15) $\dot{X} = F_t + J\dot{Y}$ 

is an identity in t. 

The system (8) of 2n - m relations is called an invariant system of (3) if 
every solution X(t) of (3) which is on the manifold A(t) for a single t is on 
A(t) for every t, where the manifold A(t) is supposed to be a non-empty set. 
Any parametrization (11) of an invariant system (8) of (3) will also be called 
an invariant system of (3). An invariant system (8) of (3) does not determine 
the function (9) uniquely, since (12) is the only restriction as to the choice of 
the parameters (10). In what follows, the parametrization (11) of (8) will be 
considered as fixed. 

It is clear from (12) that if a given solution X(t) of (3) satisfies the 
invariant system (8), then there exists in the space of the parameter vector Y 
exactly one path Y = Y(t) for which (14) is valid. According to (15) and (3), 
this unique Y(t) is such that the relation 

(16) $GJ\dot{Y} = -GF_t + \operatorname{grad}_X h$ 

is, in virtue of (11), an identity in t. Now there exists for every point of the 
manifold A(t) exactly one solution X(t) of (3) passing through this point. 
Hence (16) and (12) imply that there belongs to the parametrization (11) of 
the invariant system (8) of (3) a unique m-vector 

(17) $L = L(Y, t)$ 

which is defined in the (m+1)-dimensional (Y, t)-region in such a way that 
the relation 

(18) $GJL = -GF_t + \operatorname{grad}_X h$ 
is, in virtue of (11), an identity in Y and t. Since both Y(t) and L(Y, t) are 
uniquely determined, comparison of (16) with (18) shows that 

(19) $\dot{Y} = L(Y, t).$ 

Accordingly, X(t) is a solution of (3) satisfying the invariant system (8) of 
(3) if and only if the system (19) of m ordinary differential equations has a 
solution Y(t) by means of which X(t) is representable in the form (14). It is 
clear, from the proof, that the two differential equations (19), (16) are 
equivalent, although (19) has, and (16) need not have, the normal form of ordinary 
differential equations. 

2. Some formal consequences of the properties of G. On defining an 
m-vector R by 

(20.1) $R = R(Y, t) = \tfrac{1}{2} J'GF$ 
and a scalar r by 
(20.2) $r = r(Y, t) = \tfrac{1}{2} F_t'GF,$ 

one will be able to show that the matrix function J'GJ of (Y, t) may be 
represented in the (Y, t)-region under consideration as the alternating derivative 
(curl) 

(21.1) $J'GJ = (\operatorname{grad}_Y R) - (\operatorname{grad}_Y R)',$ 
while the vector function $J'GF_t$ of (Y, t) appears in the form 
(21.2) $J'GF_t = R_t - \operatorname{grad}_Y r.$ 
In fact, (21.1) and (21.2) are the identities 

$2J'GJ = \operatorname{grad}_Y (J'GF) - (\operatorname{grad}_Y (J'GF))'$ 
and 

$2J'GF_t = (J'GF)_t - \operatorname{grad}_Y (F_t'GF),$ 

which follow from (2) and (13), since the second derivatives cancel. 
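
The curl representation (21.1) can be tested numerically on a small example. The sketch below is only an illustration under stated assumptions: n = 1, m = 2, G = [[0, -1], [1, 0]], and a hypothetical polynomial map F(Y) = (y1 + y2^2, y1 y2) that is not taken from the paper. The Jacobian J is coded by hand, R is formed as in (20.1), and the Y-gradient of R is approximated by central differences.

```python
G = [[0.0, -1.0], [1.0, 0.0]]  # n = 1, assumed block form of G

def F(y):
    y1, y2 = y
    return [y1 + y2 * y2, y1 * y2]

def J(y):
    """Exact Jacobian grad_Y F of the map F above."""
    y1, y2 = y
    return [[1.0, 2.0 * y2], [y2, y1]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def R(y):
    """R = (1/2) J' G F, cf. (20.1)."""
    return [0.5 * s for s in matvec(transpose(J(y)), matvec(G, F(y)))]

def grad_R(y, eps=1e-5):
    """Central-difference approximation; entry (i, j) is dR_i / dy_j."""
    m = len(y)
    out = [[0.0] * m for _ in range(m)]
    for j in range(m):
        yp, ym = list(y), list(y)
        yp[j] += eps
        ym[j] -= eps
        Rp, Rm = R(yp), R(ym)
        for i in range(m):
            out[i][j] = (Rp[i] - Rm[i]) / (2 * eps)
    return out

def curl_defect(y):
    """Largest entry of J'GJ - (grad R - (grad R)')."""
    Jy = J(y)
    lhs = matmul(transpose(Jy), matmul(G, Jy))
    gR = grad_R(y)
    return max(abs(lhs[i][j] - (gR[i][j] - gR[j][i]))
               for i in range(2) for j in range(2))
```

For this F one finds by hand $R = \tfrac{1}{2}(y_2^3,\; y_1^2 - y_1 y_2^2)$, and both sides of (21.1) have the off-diagonal entry $2y_2^2 - y_1$, so `curl_defect` should be zero up to discretization error.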
Since the Jacobian (13) is a (2n, m)-matrix, it is clear from (1) that J'GJ 
is a skew-symmetric (m, m)-matrix, so that 

(22) 2l = rank of J'GJ 
for some integer l which satisfies the inequality 

(23) $2l \le m$ 

and will be supposed to be independent of (Y, t) in the (m+1)-dimensional 
(Y, t)-region under consideration. It may be mentioned that (6) and (23) can 
be replaced by the sharper statement that 


(23 bis) $2m - 2n \le 2l \le m.$ 
While the definition (22) of the integer l assumes, in view of (13), the choice 
of a parametrization (11) of the invariant system (8), it will be shown that l 
is uniquely determined by the invariant system (8) alone. This is implied by 
the relation 

(24) 2m - rank of J'GJ = 2n - rank of NGN', 
where N denotes the (2n - m, 2n)-matrix 
(25) $N = N(X, t) = \operatorname{grad}_X K,$ 

so that NGN' is, in view of (1), a skew-symmetric (2n - m, 2n - m)-matrix. In 
fact, since (11) is a parametrization of (8), $K(F(Y, t), t) = 0^{2n-m}$ is an identity 
in (Y, t). On differentiating this identity with respect to the m components 
of Y, one sees from (25) and (13) that 

(26.1) $NJ = 0^{2n-m}_m$ 

is an identity in (Y, t) in virtue of (11). Furthermore, it is seen from (12) that 
there exists an (m, 2n)-matrix T = T(Y, t) such that 

(26.2) $TJ = E_m,$ 

where $E_m$ is the m-rowed unit matrix. Now it is clear from (7), (25), (26.1) and 
(26.2) that the (2n, 2n)-matrix 

$\begin{pmatrix} N \\ T \end{pmatrix}$ 

has the rank 2n; that is, it has a non-vanishing determinant; and its reciprocal 
matrix is, by (26.1) and (26.2), of the form 

$(W \;\; J),$ 

where W = W(Y, t) is some (2n, 2n - m)-matrix. Hence (2) implies that the 
two (2n, 2n)-matrices 


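$H_1 = \begin{pmatrix} N \\ T \end{pmatrix} G \begin{pmatrix} N \\ T \end{pmatrix}' = \begin{pmatrix} NGN' & NGT' \\ TGN' & TGT' \end{pmatrix}$ 

and 

$H_2 = -(W \;\; J)'\, G\, (W \;\; J) = -\begin{pmatrix} W'GW & W'GJ \\ J'GW & J'GJ \end{pmatrix}$ 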
are reciprocal matrices. Now a well known theorem states that, since $H_1$ and 
$H_2$ are reciprocal matrices, the difference of order and rank in NGN' is the 
same as in J'GJ. This proves (24). A direct proof of (24) follows by 
multiplication of $H_1$ and the matrix 
yond 
J'GJ 
= O( a) 


The Lagrange bracket 

(27.1) $[u, v] = \sum_{\lambda=1}^{n} \frac{\partial(a_{n+\lambda}, a_\lambda)}{\partial(u, v)} = -[v, u]$ 

of a 2n-vector 
$(a_1, \dots, a_{2n}) = A = A(u, v),$ 
which depends on two scalar parameters u, v, may be written in the form 
(27.2) $[u, v] = A_u'GA_v,$ 
so that the well known identity 
(27.3) $[u, v]_w + [v, w]_u + [w, u]_v = 0,$ 

which holds for an arbitrary 2n-vector A = A(u, v, w) depending on three 
scalar parameters, may be written as 

(27.4) $(A_u'GA_v)_w + (A_v'GA_w)_u + (A_w'GA_u)_v = 0.$ 
Similarly, the Poisson parenthesis 

(28.1) $(f_1, f_2) = \sum_{\lambda=1}^{n} \frac{\partial(f_1, f_2)}{\partial(x_{n+\lambda}, x_\lambda)} = -(f_2, f_1)$ 

of two scalars 
$f_1 = f_1(X, t), \quad f_2 = f_2(X, t)$ 
may be written as 

(28.2) $(f_1, f_2) = (\operatorname{grad}_X f_1)'G(\operatorname{grad}_X f_2).$ 
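
The agreement between the Jacobian-sum form and the matrix form of the Poisson parenthesis can be illustrated as follows. This is a sketch under the assumption that G has the block form [[0, -E_n], [E_n, 0]] and that the parenthesis is the sum of the Jacobians of $(f_1, f_2)$ with respect to $(x_{n+\lambda}, x_\lambda)$; the functions take the X-gradients of $f_1$ and $f_2$ at a point as plain lists, and every name is illustrative.

```python
def symplectic_G(n):
    G = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        G[i][n + i] = -1
        G[n + i][i] = 1
    return G

def poisson_sum(g1, g2):
    """Sum over lambda of d(f1, f2) / d(x_{n+lambda}, x_lambda), from the gradients g1, g2."""
    n = len(g1) // 2
    return sum(g1[n + k] * g2[k] - g1[k] * g2[n + k] for k in range(n))

def poisson_matrix(g1, g2):
    """(grad f1)' G (grad f2)."""
    n = len(g1) // 2
    G = symplectic_G(n)
    Gg2 = [sum(a * b for a, b in zip(row, g2)) for row in G]
    return sum(a * b for a, b in zip(g1, Gg2))

# Gradients at the point X = (q1, q2, p1, p2) = (1, 2, 3, 4) of the
# hypothetical functions f1 = q1 * p2 and f2 = q2**2 + 2 * p1.
g1 = [4, 0, 0, 1]
g2 = [0, 4, 2, 0]
value_sum = poisson_sum(g1, g2)
value_matrix = poisson_matrix(g1, g2)
```

With this sign convention the parenthesis of a coordinate with its own impulse is -1 rather than +1, which is worth keeping in mind when comparing with textbooks that order the pair the other way.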


It may be mentioned that the matrix function J'GJ of (Y, t) is independent 
of t if and only if the vector function $J'GF_t$ of (Y, t) is the Y-gradient of 
a scalar function of (Y, t). In fact, $(J'GJ)_t$ is the alternating derivative 
(curl) of $J'GF_t$ with respect to Y. In order to verify this, it is sufficient to 
replace in (27.4) the variables A, u, v, w by F, t and two components of Y, 
respectively. 
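
The rank relation (24) is purely algebraic and can be checked on a small linear example. The sketch below assumes n = 2, m = 3, the block form G = [[0, -E_2], [E_2, 0]], and a hypothetical relation $K(X) = x_1 + x_2 + x_3 - x_4$ with the parametrization $F(Y) = (y_1, y_2, y_3, y_1 + y_2 + y_3)$; neither is taken from the paper, and no Hamiltonian is involved. Ranks are computed exactly over the rationals.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

n, m = 2, 3
G = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]
J = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]   # grad_Y F, a (2n, m)-matrix
N = [[1, 1, 1, -1]]                                # grad_X K, a (2n - m, 2n)-matrix

JGJ = matmul(transpose(J), matmul(G, J))           # skew-symmetric (m, m)-matrix
NGN = matmul(N, matmul(G, transpose(N)))           # skew-symmetric (2n - m, 2n - m)-matrix

lhs = 2 * m - rank(JGJ)
rhs = 2 * n - rank(NGN)
```

Here NGN' is the (1, 1) zero matrix, so (24) forces the rank of J'GJ to be 2, that is, l = 1, whatever parametrization of the same relation is chosen.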
3. Separating parametrizations of invariant systems. It will be convenient 
to split the m-vector (10), with the use of the integer (22) satisfying (23), 
into the 2l-vector 

(29.1) $Y^0 = (y_1, \dots, y_{2l})$ 
and the (m - 2l)-vector 
(29.2) $Y^* = (y_{2l+1}, \dots, y_m),$ 
so that 

(29.3) $Y = \begin{pmatrix} Y^0 \\ Y^* \end{pmatrix}.$ 

The limiting case 
(29 bis) $2l = m$, that is, $Y = Y^0$, 

will not be excluded. 

The parametrization (11) of (8) will be called a separating parametrization 
if there exists a (2l, 2l)-matrix $S^0 = S^0(Y, t)$ by means of which the 
(m, m)-matrix (21.1) may be represented in the form 

(30) $J'GJ = \begin{pmatrix} S^0 & 0^{2l}_{m-2l} \\ 0^{m-2l}_{2l} & 0^{m-2l}_{m-2l} \end{pmatrix}.$ 
It is clear from (22) and from the skew-symmetry of (21.1) that 
(31) $\det S^0 \ne 0$ 
and 
(32) $S^0 = -(S^0)',$ 

whenever $S^0$ exists. 

It will be shown that if $S^0 = S^0(Y, t)$ exists, that is, if (11) is a separating 
parametrization of the invariant system (8) of (3), then there exist a 
2l-vector 

(33.1) $R^0 = R^0(Y^0, t)$ 
and a scalar 
(33.2) $h^0 = h^0(Y^0, t),$ 

such that the (2l, 2l)-matrix $S^0$ occurring in (30) may be represented as the 
alternating derivative 

(34.1) $S^0(Y^0, t) = S^0 = (\operatorname{grad}_{Y^0} R^0) - (\operatorname{grad}_{Y^0} R^0)',$ 
while the m-vector (21.2) appears in the form 

(34.2) $J'GF_t = \operatorname{grad}_Y h + \begin{pmatrix} R^0_t - \operatorname{grad}_{Y^0} h^0 \\ 0^{m-2l} \end{pmatrix},$ 
where 
(35) $h = h(Y, t) = h(F(Y, t), t);$ 

cf. (5), (11). This implies, in particular, that in case of a separating 
parametrization the functions 

$J'GJ, \quad J'GF_t - \operatorname{grad}_Y h$ 
of t and of the m-vector (29.3) are both independent of the (m - 2l)-vector 
(29.2). In the proof use will be made of the identity 
(36) $J' \operatorname{grad}_X h = \operatorname{grad}_Y h,$ 

which is obvious from (11), (13), and (35). 
First, it is clear that the vector function R is determined by the conditions 
(21.1), (21.2) only up to an additive term of the form 

(37) $\operatorname{grad}_Y b,$ 
where b = b(Y, t) is an arbitrary scalar function. If (37) is added to R, then 
the term $b_t$ has to be added to r in order to satisfy (21.2). On the other hand, 
on placing 

(38) $R = \begin{pmatrix} R^0 \\ R^* \end{pmatrix},$ 

so that $R^0 = R^0(Y, t)$ and $R^* = R^*(Y, t)$ are the 2l-vector and the 
(m - 2l)-vector formed by the first 2l and the last m - 2l components of the m-vector 
R, one can see from (29.3) that (21.1) implies the pair of identities 

(39.1) $(\operatorname{grad}_{Y^*} R^0) - (\operatorname{grad}_{Y^0} R^*)' = 0^{2l}_{m-2l}$ 
and 
(39.2) $(\operatorname{grad}_{Y^*} R^*) - (\operatorname{grad}_{Y^*} R^*)' = 0^{m-2l}_{m-2l},$ 

if the condition (30) for a separating parametrization is satisfied. Since (39.2) 
is precisely the integrability condition for the existence of a scalar 

(40) $b = b(Y, t) = b(Y^0, Y^*, t)$ 

such that 
(41) $R^* = \operatorname{grad}_{Y^*} b$ 
is an identity in $(Y, t) = (Y^0, Y^*, t)$, and since (38) is undetermined up to an 
additive term of the form (37), it is clear that, on modifying the functions 
R and r in a suitable manner, one can assume without loss of generality 
that the function (40) is identically zero. Then $R^* = 0^{m-2l}$ by (41); hence 

(42) $R = \begin{pmatrix} R^0 \\ 0^{m-2l} \end{pmatrix}$ 
by (38), and (39.1) goes over into 
(42 bis) $\operatorname{grad}_{Y^*} R^0 = 0^{2l}_{m-2l}.$ 

Now it is seen from (42 bis) and (29.3) that the function $R^0 = R^0(Y, t)$ is of 
the form (33.1). 

In order to prove the existence of the function (33.2), notice first that on 
multiplying (18) by -J' and using (36), one obtains 
(43) $-J'GJL = J'GF_t - \operatorname{grad}_Y h.$ 
Since substitution of (42) into (21.2) gives 

(44) $J'GF_t - \operatorname{grad}_Y h = \begin{pmatrix} R^0_t - \operatorname{grad}_{Y^0} (r + h) \\ -\operatorname{grad}_{Y^*} (r + h) \end{pmatrix},$ 

it is seen from (30) and (43) that 
(45) $0^{m-2l} = \operatorname{grad}_{Y^*} (r + h).$ 

Now (45) means that the function r + h of $(Y, t) = (Y^0, Y^*, t)$ is independent 
of $Y^*$; that is, that the scalar function $h^0$ defined by 

(46) $h^0 = r + h$ 

is of the form (33.2). Finally, substitution of (45) and (46) into (44) gives 
(34.2), while (34.1) follows from (30) if one substitutes (33.1) and (42) into 
(21.1). 

The theorem thus proved and the last remark in §2 imply that in the 
separating case characterized by (30) the function (34.1) of $(Y^0, t)$ is 
independent of t if and only if the function (33.1) is also, or, more precisely, that 

(47) $S^0_t = 0^{2l}_{2l}$ if and only if $R^0_t = 0^{2l}$ 

holds for a suitable choice of the function (33.1). 

4. The separated differential equations belonging to a separating 
parametrization of an invariant system. The name “separating parametrization” 
will now be justified by showing that if the parametrization (11) of an 
invariant system (8) of (3) is a separating parametrization, then the first 2l of the 
m differential equations represented by (19) appear in the form 
(48) $\dot{Y}^0 = L^0(Y^0, t);$ 

that is, as a system of 2l differential equations which is separated from the 
m - 2l differential equations determining the components (29.2) of the 
m-vector (29.3) as functions of t. The latter differential equations are, in view of 
(19) and (29.3), of the form 

(49) $\dot{Y}^* = L^*(Y^*, Y^0, t).$ 

Thus if there is known a solution $Y^0(t)$ of (48), the corresponding functions 
$Y^*(t)$ are the solutions of 

(50) $\dot{Y}^* = L^*(Y^*, t),$ 
where 
(50 bis) $L^*(Y^*, t) = L^*(Y^*, Y^0(t), t)$ 

(cf. (49)) is a known function of t and of the (m - 2l)-vector (29.2), so that 
(19) splits into the two separated systems (48), (50). Finally, the m-vector 
Y(t), determined by $Y^0(t)$ and $Y^*(t)$ in virtue of (29.3), gives the 
corresponding solution (14) of (3). The system (48) of order (22) will be referred to as 
the separated system belonging to the separating parametrization (11) of (8). 

In order to prove the possibility of splitting (19) into (48) and (50), notice 
first that (19) is, as pointed out at the end of §1, equivalent to the system 
(16). Now, if (16) is multiplied by J', the first 2l of the resulting m differential 
equations appear, in view of (30), (34.2), (36), and (29.3), as 

(51) $S^0\dot{Y}^0 = -R^0_t + \operatorname{grad}_{Y^0} h^0.$ 

This completes the proof, since (33.1), (33.2), (34.1), and (31) imply that (51) 
can be written in the form (48). Needless to say, (48) is in the limiting case 
(29 bis) identical with the original system (19). The result just proved 
contains, however, information for this limiting case also. 

In fact, whether (29 bis) is or is not satisfied, the result proved above may 
be described by saying that if the parametrization (11) of an invariant system 
(8) of (3) is a separating parametrization, then the 2l-vector (29.1) is 
determined by a system of 2l differential equations which form a non-degenerate 
Pfaffian dynamical system with l degrees of freedom.† In order to see this, it 

† The term Pfaffian dynamical system is used in the sense of Birkhoff. Such a system is called 
non-degenerate if it can be solved with respect to the time derivatives, that is, if the skew-symmetric 
matrix, represented by the curl of the vector potential involved, does not vanish in the domain of the 
phase space which is under consideration. 

is sufficient to notice that on the one hand (34.1) and (51) are equivalent to 
(52) $\{(\operatorname{grad}_{Y^0} R^0) - (\operatorname{grad}_{Y^0} R^0)'\}\dot{Y}^0 = -R^0_t + \operatorname{grad}_{Y^0} h^0,$ 

and that on the other hand (52) clearly is the system of Euler-Lagrange 
equations which belongs to the calculus of variations problem 

(52 bis) $\delta \int \{(\dot{Y}^0)'R^0(Y^0, t) + h^0(Y^0, t)\}\,dt = 0,$ 

where the ends are fixed and the problem is non-degenerate in view of (31) 
and (34.1). 

5. Existence of separating parametrizations of an invariant system. The 
considerations of §3 and §4 assumed that there is given a separating 
parametrization (11) of the invariant system (8) of (3). Actually, one can always 
choose the parametrization of an invariant system as a separating 
parametrization. This existence theorem can be inferred from a classical theorem in the 
theory of Pfaffians as follows. 

First, (21.1), (21.2), (36), and (16) hold for arbitrary parametrizations of 
(8). Let (11) be a given parametrization. Since from (16) and (36) 

(53) $J'GJ\dot{Y} + J'GF_t - \operatorname{grad}_Y h = 0^m,$ 
it is seen from (21.1) and (21.2) that 
(54) $\{(\operatorname{grad}_Y R) - (\operatorname{grad}_Y R)'\}\dot{Y} + R_t - \operatorname{grad}_Y (r + h) = 0^m,$ 

where R, r, and h are the functions (20.1), (20.2), and (35). If one denotes 
by w the (m+1)-ary Pfaffian 

(55) $w = R'\,dY + (r + h)\,dt$ 

(cf. (10)), it is clear that (53) is the system of Euler-Lagrange equations 
belonging to 

(56) $\delta \int w = \delta \int \{[R(Y, t)]'\,dY + [r(Y, t) + h(Y, t)]\,dt\} = 0,$ 

where the ends are fixed. Notice that (56), in contrast with (52 bis), need not 
be non-degenerate, since the determinant of the skew-symmetric 
(m, m)-matrix (21.1) vanishes identically unless (29 bis) is satisfied. 

Let $G^0$ denote the (2l, 2l)-matrix 

(57) $G^0 = \begin{pmatrix} 0^l_l & -E_l \\ E_l & 0^l_l \end{pmatrix}$ 

obtained from (1), (2) by replacing n by l, where l is defined by (22) and 
satisfies (23 bis). It is well known† that on subjecting the m-dimensional Y-space 
to a suitable non-singular local point transformation for every fixed t, and 
omitting then an additive complete differential, one can transform the 
Pfaffian (55) into the (2l+1)-ary normal form 

(58) $-\tfrac{1}{2} (Y^0)'G^0\,dY^0 + h^0\,dt,$ 

it being understood that in (58) the notation (29.3) is applied to the new 
m-vector Y and that 

(59) $h^0 = h^0(Y^0, Y^*, t)$ 

is some scalar function. Since a variation problem (56) is covariant under 
point transformations, and since a complete differential can always be 
omitted beneath the signs $\delta \int$, where the ends are fixed, it is clear from (59) 
and from the normal form (58) of (55) that (56) appears in terms of the new 
m-vector Y in the form 

(60) $\delta \int \{-\tfrac{1}{2} (Y^0)'G^0\,dY^0 + h^0(Y^0, Y^*, t)\,dt\} = 0.$ 

Now, (57) being a constant matrix, it is seen from (29.3) that the 
Euler-Lagrange equations belonging to (60) are 

(61) $G^0\dot{Y}^0 = \operatorname{grad}_{Y^0} h^0, \quad 0^{m-2l} = \operatorname{grad}_{Y^*} h^0.$ 

On substituting (29.3) into (57) and comparing the result with (61), one 
obtains 

(62) $J'GJ = \begin{pmatrix} G^0 & 0^{2l}_{m-2l} \\ 0^{m-2l}_{2l} & 0^{m-2l}_{m-2l} \end{pmatrix}.$ 

Consequently, the condition (30), which characterizes the separating 
parametrizations, is satisfied, $S^0$ being the matrix (57). This completes the 
proof of the existence theorem stated at the beginning of this section. 

Let it be mentioned for application in §6 that the function (59) occurring 
in (61) can be chosen, in view of the results of §3, as a function of the form 
(33.2). In other words, one can write (61) in the form 

(63) $G^0\dot{Y}^0 = \operatorname{grad}_{Y^0} h^0$ 
in view of $\operatorname{grad}_{Y^*} h^0 = 0^{m-2l}$. 


† The normal form usually given in textbooks is not (58) but another Pfaffian. However, the 
latter differs from the Pfaffian (58) only in a complete differential; cf., e.g., van Kampen and Wintner 
[2], p. 862. 
6. Canonical parametrizations of an invariant system. A given 
parametrization (11) of a system (8) of 2n - m invariant relations of a canonical system (3) 
with n degrees of freedom will be called a canonical parametrization if it is a 
separating parametrization (§3), and the corresponding separated system of 
2l differential equations, that is, the system (48), is a canonical system with l 
degrees of freedom, so that (52) is of the particular form (63). Thus the result 
found at the end of §5 may be expressed by saying that the invariant system 
(8) of (3) always admits a parametrization which is a canonical 
parametrization. 

The construction of a canonical parametrization of a given invariant 
system depends on the construction of a point transformation which transforms 
the Pfaffian (55) into its normal form (58). And the construction of such a 
point transformation is known to require the solution of a complete system 
of partial differential equations, a solution which, in general, cannot be 
obtained by mere quadratures. Hence, if one does not want to refer to the 
existence theorem of complete systems, which depends on successive 
approximations or on equivalent processes, one has to assume that the point 
transformation in question is a priori known. This is the situation in many 
applications where the geometrical or dynamical connection suggests the 
“correct” choice of the parameters (10). Actually, it is easy to see from the 
considerations of §5 that the point transformation which transforms the 
Pfaffian (55) into its normal form (58) can be chosen such as to depend only 
on the function (8) and not on the function (5). In other words, (8) admits a 
parametrization (11) which is a fixed canonical parametrization for all those 
canonical systems (3), with n degrees of freedom, for which the fixed system 
(8) of 2n - m relations is an invariant system. 

It will be assumed in what follows that both (8) and (3) are fixed. The 
considerations of §5 clearly imply that if a given separating parametrization 
(11) of (8) is such that (30) is of the particular form (62), that is, if 

(64) $S^0 = G^0$ 

holds for the constant (2l, 2l)-matrix (57) and for the matrix (34.1) occurring 
in the definition (30) of a separating parametrization, then the given 
separating parametrization is a canonical parametrization. 

If a canonical parametrization (11) of an invariant system (8) of (3) is 
such that, for some constant 2l-vector C, the $Y^0$-gradient of the Hamiltonian 
function (33.2) of the separated system (63) is, at $Y^0 = C$, the $0^{2l}$-vector for 
every t, then 

$Y^0(t) = C$ 

clearly is a solution of (63). For this solution of equilibrium, (50) goes over, in 
view of (50 bis), into 

(65) $\dot{Y}^* = L^*(Y^*, C, t),$ 

so that 

(66.1) $Y(t) = \begin{pmatrix} C \\ Y^*(t) \end{pmatrix}$ 

is, for every solution $Y^*(t)$ of (65), a solution of (19). The solutions X(t) of (3) 
obtained from (66.1) by means of (14) are, in the main, the stationary 
solutions of Routh and Levi-Civita.† 

7. Completely canonical parametrizations. A canonical parametrization 
(11) of an invariant system (8) of (3) will be said to be completely canonical 
if the Hamiltonian function (33.2) of the separated canonical system (63) 
may be obtained by direct substitution of (11) and (29.3) into the 
Hamiltonian function (5) of the original canonical system (3), so that 

(67) $h^0 = h^0(Y^0, t) = h(F(Y^0, Y^*, t), t).$ 

This implies, of course, that $h(F(Y^0, Y^*, t), t)$ is a function of $Y^0$ and t alone, 
that is, that 

(67 bis) $\operatorname{grad}_{Y^*} h(F(Y^0, Y^*, t), t) = 0^{m-2l}.$ 

If a canonical parametrization (11) of (8) does not contain t, that is, if 
$F_t = 0^{2n}$, then the parametrization is completely canonical. In order to prove 
this, notice first that if $F_t = 0^{2n}$, then, as seen from §3, the functions R and 
r of Y and t are independent of t. Hence it is sufficient to prove that r is 
independent of Y. In fact, r is then a constant and can, therefore, be omitted 
in (46). Now $\operatorname{grad}_Y r = 0^m$ follows from (21.2) in view of $R_t = 0^m$ and $F_t = 0^{2n}$. 

Remark. On comparing the fact thus proved with the considerations of 
§5, it is seen that every invariant system (8) which does not contain t 
explicitly admits a completely canonical parametrization (the Hamiltonian 
function (5) need not be independent of t). 

As an application of completely canonical parametrizations, a generaliza- 
tion of the well known case of ignorable coordinates will now be considered. 

If a coordinate $x_\lambda = q_\lambda$ (cf. (4), (4 bis)) does not occur in the Hamiltonian 
function (5) of (3), the corresponding impulse $x_{n+\lambda} = p_\lambda$ clearly is independent 
of t for every solution X(t) of (3). Thus the case of an ignorable coordinate is 
the case of first integrals (cf. §10) of the form $x_{n+\lambda}$ = const. and is implied by the 
more general case where $x_\lambda = q_\lambda$ does occur in (5) but $x_{n+\lambda} = p_\lambda$ has a constant 


† Cf. Levi-Civita [4]. 
value $c_{n+\lambda}$ for some solutions X(t) of (3), so that $x_{n+\lambda} = c_{n+\lambda}$ is not a first 
integral but merely an invariant relation. This case of coordinates which are 
ignorable in a generalized sense is again a particular case of the one where a system 
(8) of 2n - m invariant relations is given as consisting on the one hand of 
n - l relations of the form 

(68.1) $x_{n+\lambda} = c_{n+\lambda}, \quad \lambda = l + 1, \dots, n,$ 
and on the other hand of a system of n - m + l relations of the type 
(68.2) $x_\rho = f_\rho(x_1, \dots, x_{m-l}, x_{n+1}, \dots, x_{n+l}), \quad \rho = m - l + 1, \dots, n,$ 


where l denotes some integer satisfying (23 bis). The last assumption implies
that


(68 bis) Oslin and 


and that all the x's occurring in (68.1) are impulses in view of (4 bis). The
case (68.1), (68.2) of (8) is of importance in certain applications† and can be
parametrized in a completely canonical way as follows:

First, a parametrization (11) of the invariant system (8) represented by
the 2n − m relations (68.1) and (68.2) may be obtained by choosing as the m
components y_i of the parameter vector (10) those m components x_i of the
2n-vector (4) for which either 1 ≤ i ≤ m − l or n < i ≤ n + l. If one enumerates
the m components of this Y in a suitable manner, the parametrization (11) of the
invariant system (68.1), (68.2) appears in the form (11), if one adjoins the
relations


(68.3) = Viy = Yi, = = 


to (68.1), (68.2). Hence it is easily verified from (1) and (13) that the integer l
occurring in (68.1), (68.2), (68.3) is the same as the one defined by (22), and
that the matrix J'GJ is, in view of (57), precisely the matrix (62). Since (62)
is satisfied, the parametrization is canonical; that it is completely canonical
follows from the fact that t does not occur in (68.1), (68.2), (68.3), that is, in
(11).

Accordingly, the reduction of (3) to the separated canonical system (63)
is, in the case of an invariant system of the form (68.1), (68.2), identical with
the reduction of (3) in the classical case of first integrals represented by
ignorable coordinates. In this particular case, the function (50 bis) is, in view of
(68.3), independent of Y*; hence Y* = Y*(t) follows from (50) by quadrature,
while the system (50) determining Y* cannot always be solved by quadratures
in the more general case (68.1), (68.2), (68.3).


† Cf. van Kampen and Wintner [3], pp. 155-156 and pp. 164-166.

8. Canonical parametrization of holonomic invariant systems. In this
section the number l will be introduced as a given integer for which


(69) 0 < l < n,

but it will turn out that l is identical with the number defined by (22), while
the number m defined by (7) will become

(70) m = n + l.


The invariant system (8) of (3) will be called holonomic if it is independent
of the n components of the 2n-vector (4) which are represented in (4 bis) by
the n-vector P; that is, if (8) is of the form

(71) U(Q, t) = 0^{n−l},

where

(72) U = U(Q, t)

is an (n − l)-vector depending on the n-vector Q, defined by (4 bis), and
possibly on t. It is clear from (4 bis) that in this holonomic case (7) goes over
into

(73) n − l = rank of I,

where I denotes the Jacobian (n − l, n)-matrix

(74) I = I(Q, t) = grad_Q U,

m being the number (70). According to (70), one can write the parameter
vector (10) in the form

          ( Q̄  )
(75)  Y = ( P̄  ),
          ( P* )

where Q̄ and P̄ are l-vectors and P* is an (n − l)-vector. Furthermore, it is
seen from (73) that (71) admits by means of an n-vector

(76) A = A(Q̄, t)

a parametrization of the form

(77) Q = A(Q̄, t),

where

(78) l = rank of grad_Q̄ A.

This clearly implies the existence of (l, n)-matrices

(79) H = H(Q̄, t)

for which

(80) H grad_Q̄ A = E_l

(where E_l is the unit matrix) is an identity.
Now it will be shown that if an invariant system (8) of (3) is holonomic,
then it admits a parametrization (11) which is canonical and of the form

                       (       A        )
(81)  X = F(Y, t) =    ( H'P̄ + I'P*    ),

where the m-vector Y is written in the form (75), the n-vector A is arbitrarily
chosen such that (77) gives a parametrization of (71) satisfying (78), and H
is any matrix satisfying (80); finally the matrix I defined by (74) is thought
of as expressed by means of (77) as a function of (Q̄, t).

In order to prove this, notice first that substitution of (77) into (71) gives
an identity in (Q̄, t). On differentiating this identity with respect to the
components of the l-vector Q̄, one obtains

(82) I grad_Q̄ A = 0,

in view of (74). Now (80), (82), and (73) clearly imply that the (n, n)-matrix

       ( H )
(83)   ( I )

has the rank n; that is, it has a non-vanishing determinant. Hence it is clear
from (75) that, since (77) is a parametrization of the restriction (71) for Q,
(81) is a parametrization of the invariant system (8) represented by (71) and
(4 bis). It remains to be shown that this parametrization of the holonomic
invariant system is canonical.

First, the (2n, n + l)-matrix J defined by (13) is, according to (81) and
(75),

    J = ( grad_Q̄ A                0    0  )
        ( grad_Q̄ (H'P̄ + I'P*)    H'   I' ).

Hence it is seen from (1), (57) and (80), (82) that the matrix product J'GJ
satisfies the criterion (62) of a canonical parametrization if and only if the
(l, l)-matrix

    (grad_Q̄ (H'P̄ + I'P*))' grad_Q̄ A − (grad_Q̄ A)' grad_Q̄ (H'P̄ + I'P*)

is identically zero, that is, if and only if the (l, l)-matrix

    D = (grad_Q̄ A)' grad_Q̄ (H'P̄ + I'P*)

is symmetric. Now if one brings the factor (grad_Q̄ A)' of D beneath the grad
sign of the second factor of D, one clearly obtains

    D = grad_Q̄ {(grad_Q̄ A)'(H'P̄ + I'P*)} + correction term,

where the correction term, being an iterated gradient, is a symmetric matrix.
Hence it is sufficient to show that

    grad_Q̄ {(grad_Q̄ A)'(H'P̄ + I'P*)}

is a symmetric matrix. Now this matrix is

    grad_Q̄ {(H grad_Q̄ A)'P̄ + (I grad_Q̄ A)'P*} = grad_Q̄ P̄

in view of (80) and (82); hence it is the zero matrix, since P̄ is independent of
Q̄ (cf. (75)).


9. Canonical parametrizations of semi-holonomic invariant systems. The
two types of invariant systems treated below under (i) and (ii) are important
in some applications.† Each of these two invariant systems consists of two
subsystems one of which is independent of the impulses while the other is
linear in the impulses. One of the two subsystems, when considered without
the other, need not be an invariant system. The number l will be introduced,
in both cases (i), (ii), as a given integer satisfying (69); but it turns out that l
is identical with the number l defined by (22), while (70) is replaced by

(84) m = 2l.

The difference between the “semi-holonomic” cases (i), (ii) and the holonomic
case of §8 depends, in the main, on the fact that in the semi-holonomic cases
the (n − l)-vector P* is missing in (75), so that the m-vector (10) is, in
accordance with (84), of the form

          ( Q̄ )
(85)  Y = ( P̄ ),

where Q̄ and P̄ are l-vectors.
Case (i). Let

(86.1) U = U(Q, t),
(86.2) V = V(Q, t)

be a pair of functions such that U is an (n − l)-vector, V is an l-vector and, if I
and H denote the Jacobian (n − l, n)- and (l, n)-matrices

(87.1) I = I(Q, t) = grad_Q U,

(87.2) H = H(Q, t) = grad_Q V,

the determinant of the (n, n)-matrix (83) is distinct from zero, which clearly
implies that

(88.1) n − l = rank of I,
(88.2) l = rank of H.

Hence there exist (n − l, n)-matrices

(89) M = M(Q, t)

such that

(90) HM' = 0

is an identity in (Q, t) and

(91) n − l = rank of M.

The non-vanishing of the determinant of the (n, n)-matrix (83) also implies
that, if Q̄ denotes the l-vector

(92) Q̄ = V(Q, t),

there exists a unique n-vector (76) such that (78) is satisfied and (77) gives a
parametrization of (71) in such a way that (92) is an identity in (Q̄, t) in virtue
of (77).

† Cf. van Kampen and Wintner [3], pp. 154-155, and the applications given in [3].

Now suppose that an invariant system (8) of (3) is given by the pair of
conditions

(93.1) U = 0^{n−l},

(93.2) MP = 0^{n−l},

where U, M are given functions (86.1), (89) which satisfy, for some function
V (86.2), the conditions described above, and Q, P are the n-vectors defined by
(4 bis). It will be shown that the invariant system (8) represented by (93.1),
(93.2) admits a parametrization (11) which is canonical and of the form

                       (   A   )
(94)  X = F(Y, t) =    ( H'P̄  ),

where the m-vector Y is written in the form (85) by means of two l-vectors
Q̄, P̄ (m = 2l); the unique n-vector (76) defined before by means of (92) is such
that (77) gives a parametrization of (93.1); the matrix H defined by (87.2)
is thought of as expressed by means of (77) as a function of (Q̄, t); and finally
the inversion formula belonging to (94) is

          (       V        )
(95)  Y = ( (grad_Q̄ A)'P  )

in virtue of the invariant system.

In fact, the first row of (95) and the first row of (94) are satisfied by the
definitions (92) and (77) of Q̄ and A. Furthermore, it is seen from the
definition (87.2) of H that (80) is an identity in (Q̄, t) in virtue of (77). Now (80),
(90), and (91) imply that the (n, n)-matrix

       ( (grad_Q̄ A)' )
(96)   (      M       )

has the rank n, that is, has a non-vanishing determinant. Hence the system

       ( (grad_Q̄ A)' )       (   P̄     )
(97)   (      M       ) P  =  ( 0^{n−l} )

of linear equations has, for any l-vector P̄, a unique solution P; and this P is

(97 bis) P = H'P̄

in view of (80) and (90). Hence, on considering the first row of (97) as the
definition of P̄, the second row of (94) follows from (93.2). This proves that
the invariant system (8) represented by (93.1), (93.2) admits the
parametrization (94) and satisfies (95). In order to prove that this parametrization is a
canonical parametrization, that is, that the (2l, 2l)-matrix J'GJ, where J is
defined by (13) and (94), satisfies the identity (62), one merely has to repeat,
with obvious modifications, the verification carried out in detail at the end
of §8.

Case (ii). Suppose that an invariant system (8) of (3) is given by a pair
of conditions

(98.1) U = 0^{n−l},
(98.2) MP = 0^{n−l},

where Q, P are the n-vectors defined by (4 bis), and

(99.1) U = U(Q, t),
(99.2) M = M(Q, t)

(U is an (n − l)-vector, M is an (n − l, n)-matrix), and

(100) n − l = rank of IM',

where I denotes the (n − l, n)-matrix (74). The existence of a function (86.2),
occurring in Case (i), is not assumed in the present Case (ii). Assumption
(100) clearly implies (73) and (91), while (73) implies the existence of
n-vectors (76) which satisfy (78) and are such that (77) gives a parametrization
of (98.1) in terms of an l-vector Q̄.

It will be shown that the invariant system (8) represented by (98.1),
(98.2) admits, for every choice of the function (76), a parametrization (11)
which is canonical and of the form (94), (85), where the l-vector P̄ is such that

(101) P̄ = (grad_Q̄ A)'P,

while H = H(Q̄, t) is an (l, n)-matrix which is uniquely determined by the
choice of (76) and satisfies (80) and (90).

In order to prove this, notice first that (82) is an identity in virtue of (77),
the reason being the same as in §8. Now (82), (77), and (100) imply that the
(n, n)-matrix (96) has the rank n, that is, it has a non-vanishing determinant.
Hence there exists a unique (l, n)-matrix H = H(Q̄, t) which satisfies (80) and
(90). Let (101) be considered as the definition of an l-vector P̄. Since the
determinant of (96) is distinct from zero, the linear system (97) has a unique
solution P; and this P is given by (97 bis) in view of (80) and (90). This proves
the second row of (94), while the first row of (94) is identical with (77).
Finally, (98.2) and (101) are satisfied in view of (97). This proves that (94)
is a parametrization (11) of the invariant system (8) represented by (98.1),
(98.2). That this parametrization is a canonical parametrization is proved in
the same way as in Case (i).

10. The case of first integrals. The results of the preceding sections
concern the general case of an invariant system (8) of (3). In what follows, it
will be assumed that the relation

(102) K(X, t) = C,

where C is a constant (2n − m)-vector, is an invariant system of (3) not only
for a single choice of the constant vector C but for all choices of C in some
(2n − m)-dimensional region of the C-space. This is the case if and only if
each of the 2n − m scalar functions which constitute the components of the
vector function K = K(X, t) is a first integral of (3). Since a scalar function
g = g(X, t) represents a first integral of (3) if and only if ġ = 0 is an identity in
virtue of (3) alone, it is sufficient to prove that if (102) is an invariant system
for every C, then K̇ = 0^{2n−m} is an identity in virtue of (3) alone. Now if (102)
is an invariant system for every C, then K̇ = 0^{2n−m} is an identity in virtue of
(3) and (102) together, and so, C being arbitrary, it is an identity in virtue
of (3) alone.

In the particular case where the invariant system consists of first integrals,
the integer 2l defined by (22) or (24) has a simple meaning, since it is
connected with an integer occurring in the classical theory of reduction of
canonical systems by means of first integrals. In fact, it turns out that if the 2n − m
first integrals represented by (102), which are independent in view of (7),
form a function group in the sense of Lie,* then this function group contains
n − l, but not more, independent first integrals in involution. In view of the
well known extension process of Poisson parentheses, the assumption that the
2n − m first integrals form a function group does not imply any loss of
generality.

Since, on placing

(103) 2k = rank of NGN', where N = grad_X K,

one can write (24), in view of (22), in the form

(104) l − k = m − n,

the statement to be proved may be formulated as follows: If (102) consists
of 2n − m first integrals of a canonical system (3) with n degrees of freedom,
and if these first integrals are independent and form a function group, then

(105.1) j = 2n − m − k,

where k is the integer defined by (103) and j denotes the maximum number
of first integrals in involution which are contained in the function group.

First, NGN' is a skew-symmetric (2n − m, 2n − m)-matrix in which the j²
elements contained in the first j rows and first j columns vanish, if the first j
components of the (2n − m)-vector K(X, t) are independent first integrals in
involution. Hence it is clear from (103) that

(105.2) 2k ≤ 2(2n − m − j), that is, j ≤ 2n − m − k.

In order to complete the proof of (105.1), it remains to be shown that

(105.3) j ≥ 2n − m − k.

Now (105.3) is implied by a known existence theorem concerning functions
in involution.†

* Cf., e.g., the presentation of Engel [1].
† The presentation of the local existence theorems by Engel [1] assumes, but actually does not
use at all, that the systems are analytic.

Remark. There may be mentioned an essential difference between the case
(8) of invariant systems and the case (102) of first integrals (in involution).
In the case of first integrals one can solve (50) by mere quadratures. In other
words, the separation of (19) into (48) and (50) reduces the degree of freedom
from n to l in case of first integrals. This is a well known consequence of the
Hamilton-Jacobi theory (Lie). On the other hand, the solution of (50) can
require operations higher than quadratures in case of invariant systems.

Application to the problem of three bodies. In order to illustrate the use
of (104) and (105.1), consider the non-planar case of the problem of three
bodies. Let ξ_i, η_i, ζ_i, where i = 1, 2, 3, be Cartesian coordinates of the mass μ_i
in an inertial coordinate system, and let Ξ_i, H_i, Z_i denote the conjugate
impulses μ_i ξ̇_i, μ_i η̇_i, μ_i ζ̇_i. Then n = 9 and (3) admits, besides the energy integral,
the nine classical first integrals represented by the functions


3 


h=> {Hei f= fs = dO {Zn — 


t=] i=1 i=1 


3 


3 3 
DE, DW, DZ; 


i=1 t=] t=1 
3 3 3 
i=1 t=1 
Let the (2n − m)-vector K occurring in (105.1) be defined as the 9-vector with
the components f_1, ..., f_9, so that m = 9. Thus (104) and (105.1) go over into

(106) l = k and j = 9 − k.

Now it will be easy to verify from (103) that k = 4, so that l = 4 and j = 5 are,
respectively, the degree of freedom of the separated system (63) and the
maximum number of first integrals in involution contained in the function group
generated by f_1, ..., f_9.

In order to prove that k = 4, one has to show in view of (103) and (25.1)
that the 9-rowed skew-symmetric matrix

    ‖(f_ν, f_κ)‖, ν, κ = 1, ..., 9,

of Poisson parentheses has the rank 8 if the coordinates ξ_i, η_i, ζ_i
and the impulses Ξ_i, H_i, Z_i do not have exceptional values. Now it is easily
verified from the above definition of f_1, ..., f_9 that


(fi, fe) = fs, (fe, far) = — fe, (fs, fa) = fs, 
3 
(fe, fz) = — fo, (fs, fz) = fe, (fa fr) = Domi, 


i=l 


and that these 6 identities remain valid under simultaneous cyclic
permutations of the three triples (1, 2, 3), (4, 5, 6), (7, 8, 9); and finally that those
9² − 2·18 = 45 Poisson parentheses which occur neither among these eighteen
nor among their negatives (f_κ, f_ν) = −(f_ν, f_κ), are identically zero. In other
words, the skew-symmetric (9, 9)-matrix ‖(f_ν, f_κ)‖ may be written in the form
(Fy 
3 
F; 0 wE 
(107) fo» fe)|| i=1 
3 
Fs — 0 
i=1 


where 0 and E denote the 3-rowed zero and unit matrices, respectively, and
F_i is the three-rowed skew-symmetric matrix


0 fi — 


(108) F; = Sai 0 Ssi-2 1 1, 
— 0 
It is easily seen from (108) and from the above representation of f_1, ..., f_9
that, if the eighteen coordinates and impulses have values which do not lie
on an algebraical manifold of a dimension number less than eighteen, the
9-rowed skew-symmetric matrix (107) is of rank 8; that is, k = 4.

If one considers the planar solutions of the problem of three bodies, one
can assume that the three ζ_i and the three Z_i are identically zero, and there
remain only the five integrals f_3, f_4, f_5, f_7, f_8. Thus n = 6 and 2n − m = 5, so
that (104) and (105.1) go over into

(109) l = k + 1, j = 5 − k.

Since the 5-rowed skew-symmetric matrix which takes the place of the
9-rowed matrix ‖(f_ν, f_κ)‖ is found to be of rank 4 by the above method, it
follows that k = 2, and therefore, by (109), that l = 3 and j = 3.
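
The bookkeeping in the two cases can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the function name `reduction_numbers` is hypothetical, and it assumes that (104) reads l − k = m − n and that (105.1) reads j = 2n − m − k, which is the reading consistent with (106) and (109).

```python
def reduction_numbers(n, m, rank):
    """Return (l, j) from n, m and the rank 2k of the Poisson-parenthesis matrix.

    Assumed readings: (104) l - k = m - n and (105.1) j = 2n - m - k.
    """
    if rank % 2:
        raise ValueError("a skew-symmetric matrix has even rank")
    k = rank // 2
    l = k + (m - n)        # degree of freedom of the separated system (63)
    j = 2 * n - m - k      # maximum number of integrals in involution
    return l, j


# Non-planar three-body problem: n = 9, nine integrals (m = 9), rank 8.
non_planar = reduction_numbers(9, 9, 8)
# Planar three-body problem: n = 6, five integrals (m = 7), rank 4.
planar = reduction_numbers(6, 7, 4)
print(non_planar, planar)
```

In both cases j = n − l, in agreement with the statement that the function group contains n − l independent first integrals in involution.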

11. Forced paths. So far it has been assumed that (8) is an invariant
system of (3). In the present section it will only be assumed that (8) satisfies (7)
and determines a non-empty manifold A(t); that is, that (8) is compatible
with itself, without necessarily being compatible with (3); so that (8) plays
the role of a system of 2n − m constraints which modify the equations of
motion (3). These constraints, in view of (4 bis), are not necessarily holonomic
and are not given in the usual way, that is, not in terms of the coordinates,
velocities, and time. It can be assumed, in view of (7), that the manifold (8)
is represented by means of a parameter vector (10) in the form (11), (12).

Denote by

(110) Y = Y(t)

any curve in the Y-space, with the time t as parameter, and let the curve in
the X-space,

(111) X = X(t),

which is defined by (110) and (14), be called the image curve of (110). Put,
along this image path,

(112) Z = Z(t) = GẊ − grad_X h,

so that the 2n-vector Z(t) is identically zero if and only if the image curve of
(110) represents a solution of (3). The (2n, m)-matrix (13) is a function of t
along the curve (110). Now if a curve (111) in the X-space is such that it is
the image curve of a curve (110) for which

(113) J'Z = 0^m

is an identity in t, then (111) will be termed a forced path belonging to the
Hamiltonian function (5) and to the constraint system (8), since it is clear
that whether a path (111) is a forced path or not is independent of the choice
of the parametrization (11) of (8). It is seen from the identities (15), (36) and
from the definitions (112) and (113) that a path (111) is a forced path
belonging to (5) and (8) if and only if the differential condition

(114) J'GJ Ẏ = −J'GF_t + grad_Y h,

where h is defined by (35), has a solution (110) by means of which (111) is
representable in the form (14). Since the skew-symmetric (m, m)-matrix
J'GJ, defined in the (m + 1)-dimensional (Y, t)-region under consideration,
can have a rank (22) which is less than m, the system (114) cannot always be
written in the normal form (19) of ordinary differential equations. In other
words, one has general existence and uniqueness theorems for the initial
problem of (114) only when accidentally m = 2l.

In what follows, only Hamiltonian functions (5) and constraints (8) will
be considered for which the initial problem of (114) has not only an existence
but also a uniqueness theorem in the region under consideration (to this end,
the assumption m = 2l is sufficient but not necessary). Then the dynamical
meaning of forced paths is indicated by the following remark: If one assumes
that the 2n-vector (112) is a force, then the orthogonality condition (113)
defining a forced path is nothing but d'Alembert's principle, which expresses the
fact that the forces Z, which modify the equations (3) of free motion in view
of the constraint (8), do not work.

It is hard to verify the general validity of this interpretation. The inter- 
pretation is certainly valid in case (8) is not a constraint but an invariant 
system, since in this case (16) is valid and (114) follows from (36). Further- 
more, the remark is valid in case the constraints are holonomic, since in this 
case (113) will be seen to be identical with the differential equations usually 
derived from d’Alembert’s principle by means of Lagrangian multipliers. 
Whether or not the assumption (113) is dynamically correct in case of non- 
holonomic constraints, is a difficult, if not meaningless, question. In fact, 
non-holonomic constraints are usually assigned in terms of velocities and not 
of impulses, while X in (8) does contain impulses in view of (4 bis). Now, in
case P actually occurs in (8), the passage from the impulse vector P to the
velocity vector Q̇ cannot be carried out before solving (113).

It remains to be verified that in the case of holonomic constraints which
can contain t explicitly, the requirement (113) of a forced path leads to the
same paths as d'Alembert's principle; that is, in the holonomic case the forced
paths are identical with the paths described under the constraints which are
required by the usual rules of dynamics. First, if (8) depends only on Q and t
and not on P, the equations of motion belonging to the Hamiltonian function
(5) and to the system (8) of 2n − m constraints are

    Q̇ = grad_P h, −Ṗ = grad_Q h + (grad_Q K)'λ,

where λ is a (2n − m)-vector representing Lagrangian multipliers. This may
be written, in view of (4 bis) and (1), in the form

    GẊ = grad_X h + (grad_X K)'λ,

since grad_P K = 0. On multiplying by J' and using the definition (112)
of Z, one obtains

    J'Z = J'(grad_X K)'λ.

This relation is identical with the equation (113) of forced paths, since

    (grad_X K)J = 0

is, according to (13), an identity in (Y, t) in virtue of (11).

Needless to say, the consideration of forced paths in case of holonomic
constraints gives for d'Alembert's principle a formulation which is invariant
under canonical transformations of the 2n-dimensional phase-space.


AN ISOMORPHISM BETWEEN LINEAR RECURRING
SEQUENCES AND ALGEBRAIC RINGS*


BY 
MARSHALL HALL 


I. INTRODUCTION 


1. The Thirteenth Century was in but its second year when Fibonacci 
(or Pisano) proposed a problem on the number of offspring of a pair of rab- 
bits, whose solution led to the sequence of numbers now named after him. 
There is reason to believe that Fermat derived many of his arithmetic theo- 
rems from a knowledge of recurrences and, certainly, his celebrated Last 
Theorem may be stated as a problem on sequences. Lucas was the first to
make any extended researches on sequences, establishing a great many
properties of certain second order sequences. Carmichael [1]‡ in 1920 made the
first attack on sequences in general, and established their fundamental
property of modular periodicity.

This paper undertakes a general survey of the modular properties of linear
recurring sequences, beginning from the results of a paper by H. T. Engstrom
and two by Morgan Ward.§ No problems on sequences are considered here
which are not strictly modular, though questions on divisibility sequences‖
and their remarkable factorization properties are closely related.

The mechanism which the author uses for examining the properties of se- 
quences is an isomorphism between the set of all sequences satisfying a fixed 
recurrence and a polynomial ring of operators. The isomorphism is not with 
the abstract ring but with a particular realization of it, and this is not es- 
pecially surprising as a linear sequence is essentially an exponential function. 
This isomorphism may be derived from the theory of generating functions,
and includes the fundamental identity used in Ward [11].

In Chapter II the isomorphism is set up and the basic properties of the 
ring are examined and their interpretation is given for the sequences. For ex- 
ample, the zero divisors of the ring correspond to the sequences which satisfy 
recurrences of lower order. 


* Presented to the Society, September 3, 1936; received by the editors August 27, 1937.
† For the early bibliography see Dickson's History of the Theory of Numbers.
‡ Numbers in square brackets refer to the bibliography at the end of this paper.
§ Engstrom [3] and Ward [11], [12].
‖ Lucas [7], Lehmer [6], Hall [4], Ward [14].

Chapter III considers in some detail the periods modulo p^j (j = 1, 2, ...)
of a fixed sequence and of all sequences. The structure of possible periods is
thoroughly investigated and a method is given for its complete
determination. The period patterns are shown to be dependent upon the ideal structure
(or lattice) of the ring.

Chapter IV gives a similar dissection of the numeric patterns of sequences, 
and their relation to null sequences. Necessary and sufficient conditions that 
a sequence be null modulo m are found, and a very elegant criterion for 
“p-adically null” sequences is given. 

Chapter V treats some questions on the distribution of modular residues 
in sequences. It is shown that the internal modular structure of sequences is 
intimately bound up with residual groups. A diophantine equation on dis- 
tribution numbers which Ward found for third order sequences is shown to 
be one of a family of equations, and it is also shown that similar equations hold 
for sequences of any order. 

Whereas Chapters III and IV would seem to exhaust the possibilities of 
the problems considered, Chapter V is only the introduction to some ex- 
tremely recondite questions. The author will consider these further in another 
paper. 

II. THE ISOMORPHISM 


2. Consider a linear recurrence of kth order

(2.1) u_{n+k} = a_1 u_{n+k−1} + a_2 u_{n+k−2} + ··· + a_k u_n

in which a_1, a_2, ..., a_k are rational integers. Two operations on sequences
(v_n) satisfying (2.1) may be defined:

I. Sum:

    (v_n) + (w_n) = (v_n + w_n).

II. Scalar product:

    t(v_n) = (t v_n), t any rational integer.

In addition an operator x is defined:

III. x(v_n) = (v_{n+1}).
These three operations may be combined to yield a ring R(x) of polynomial 
operators on the sequences, under which the sequences satisfying (2.1) are 
closed. This ring is easily shown to be associative, commutative, and dis- 
tributive. 
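
The action of a polynomial operator can be tried on a concrete sequence. The following is a minimal Python sketch, not part of the original paper: the helper names `recurrence_terms` and `apply_polynomial` are hypothetical, a sequence is represented by a finite list of its first terms, and the recurrence (2.1) is assumed to read u_{n+k} = a_1 u_{n+k−1} + ··· + a_k u_n. The operator x acts as a one-place shift, and the example checks that x² − x − 1 annihilates the Fibonacci sequence.

```python
def recurrence_terms(coeffs, initial, count):
    """First `count` terms of u_{n+k} = a_1 u_{n+k-1} + ... + a_k u_n.

    `coeffs` is [a_1, ..., a_k]; `initial` supplies u_0, ..., u_{k-1}.
    """
    k = len(coeffs)
    terms = list(initial[:k])
    while len(terms) < count:
        terms.append(sum(a * terms[-i] for i, a in enumerate(coeffs, start=1)))
    return terms[:count]


def apply_polynomial(poly, terms):
    """Apply h(x) = poly[0] + poly[1] x + ... to a sequence, x being the shift.

    Only the terms that can be computed from the finite list are returned.
    """
    degree = len(poly) - 1
    return [sum(c * terms[n + i] for i, c in enumerate(poly))
            for n in range(len(terms) - degree)]


fib = recurrence_terms([1, 1], [0, 1], 12)
shifted = apply_polynomial([0, 1], fib)           # x(v_n) = (v_{n+1})
annihilated = apply_polynomial([-1, -1, 1], fib)  # (x^2 - x - 1)(v_n)
print(fib, shifted, annihilated)
```

Sum and scalar product correspond to termwise addition and multiplication of the lists, so any integer polynomial in x is covered by `apply_polynomial`.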

The characteristic polynomial of the recurrence (2.1),
f(x) = x^k − a_1 x^{k−1} − ··· − a_k, has the property that

(2.2) f(x)(v_n) = (0)

for all sequences (v_n) satisfying (2.1). We may set up an isomorphism between
the sequences satisfying (2.1) and the ring R(x) by the correspondences:

Primary isomorphism:

(2.3) 1 ↔ (w_n), h(x) ↔ h(x)(w_n),

where (w_n) is the sequence defined by w_0 = w_1 = ··· = w_{k−2} = 0, w_{k−1} = 1,
which shall be called the unit sequence. It is a simple matter to verify that
(2.3) actually defines a one-to-one correspondence between the sequences
satisfying (2.1) and the ring R(x) of polynomials modulo f(x). In fact

(2.4) (v_n) ↔ V(x) = v_0 x^{k−1} + (v_1 − a_1 v_0) x^{k−2} + ··· + (v_{k−1} − a_1 v_{k−2} − ··· − a_{k−1} v_0)

for any sequence (v_n). This polynomial V(x) is the (unique) polynomial of
degree less than k in the residue class modulo f(x) corresponding to (v_n) by
the isomorphism (2.3). We note in passing that the sequence (v_n) is integral
if and only if the coefficients of the corresponding canonical polynomial V(x)
are integral.
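
The correspondence between a sequence and its canonical polynomial can be verified numerically. The sketch below is a hedged illustration in Python, not the author's procedure: the helper names are hypothetical, the unit sequence is taken to be w_0 = ··· = w_{k−2} = 0, w_{k−1} = 1, and (2.4) is read as saying that the coefficient of x^{k−1−j} in V(x) is v_j − a_1 v_{j−1} − ··· − a_j v_0. Applying V(x) to the unit sequence should then return the original sequence.

```python
def recurrence_terms(coeffs, initial, count):
    """First `count` terms of u_{n+k} = a_1 u_{n+k-1} + ... + a_k u_n."""
    k = len(coeffs)
    terms = list(initial[:k])
    while len(terms) < count:
        terms.append(sum(a * terms[-i] for i, a in enumerate(coeffs, start=1)))
    return terms[:count]


def apply_polynomial(poly, terms):
    """Apply h(x) = poly[0] + poly[1] x + ... to a sequence, x being the shift."""
    degree = len(poly) - 1
    return [sum(c * terms[n + i] for i, c in enumerate(poly))
            for n in range(len(terms) - degree)]


def canonical_polynomial(coeffs, terms):
    """Coefficients [c_0, ..., c_{k-1}] of V(x) as read from (2.4)."""
    k = len(coeffs)
    by_descending_power = [
        terms[j] - sum(coeffs[i - 1] * terms[j - i] for i in range(1, j + 1))
        for j in range(k)
    ]
    return by_descending_power[::-1]  # entry j belongs to x^(k-1-j)


coeffs = [1, 1]                                  # u_{n+2} = u_{n+1} + u_n
lucas = recurrence_terms(coeffs, [2, 1], 12)     # 2, 1, 3, 4, 7, ...
unit = recurrence_terms(coeffs, [0, 1], 13)      # unit sequence 0, 1, 1, 2, ...
v_poly = canonical_polynomial(coeffs, lucas)     # V(x) = 2x - 1
rebuilt = apply_polynomial(v_poly, unit)         # V(x)(w_n)
print(v_poly, rebuilt == lucas)
```

The same check can be run for a recurrence of higher order by changing `coeffs` and the initial terms.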

It is sometimes convenient to use a secondary form of the isomorphism in
considering single terms of the sequence (v_n).

Secondary isomorphism:

(2.5) V(x) → v_0, xV(x) → v_1, ..., x^j V(x) → v_j.

The secondary isomorphism is simply a correspondence between an element
h(x) of the ring R(x) and the coefficient of x^{k−1} in the canonical representative
of the residue class h(x) modulo f(x). The relations of (2.5) are justified by
the definition III of the operator x and by (2.4).

3. If f(x) is reducible, then R(x) contains zero divisors, and if f(x) has
multiple factors, then R(x) contains nilpotent elements. Hence the theory of
fields may not be applied to R(x) although a number of theorems on fields
hold true. Again, R(x) is not in general a maximal order, and the arithmetic
of algebras is not directly applicable to the arithmetic of R(x). But we may
apply many theorems on algebraic number fields to R(x) without reproving
them because of the following theorem:

THEOREM 3.1, PRESERVATION OF IDENTITIES. If F(a_1, ..., a_k) is a
polynomial in a_1, ..., a_k with rational integral coefficients and if F = 0 whenever
a_1, ..., a_k are rational integers such that f(x) = x^k − a_1 x^{k−1} − ··· − a_k is
irreducible, then F ≡ 0.


If we take f(x) arbitrary modulo p and irreducible modulo q, then F = 0.
Hence F ≡ 0 (mod p) for a_1, ..., a_k arbitrary modulo any p and F ≡ 0.

If we define the norm of (v_n) ↔ V(x) by N(v_n) = N[V(x)], then we have

                                  | v_0      v_1   ···  v_{k−1}  |
(3.1)  N(v_n) = (−1)^{k(k−1)/2}   | v_1      v_2   ···  v_k      |
                                  | ···                          |
                                  | v_{k−1}  v_k   ···  v_{2k−2} |

Similarly for the spur (or trace).
It is easily verified that the norm of the unit sequence is unity and that its
trace is k.
A classical result of interest here is the following: 


THEOREM OF KRONECKER. A necessary and sufficient condition that (v_n)
satisfy a recurrence of lower order than (2.1) is that N(v_n) = 0.


Hence, with respect to the recurrence of lowest order which (v_n) satisfies,
N(v_n) ≠ 0.
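
Kronecker's criterion can be illustrated on a small case. The following Python sketch is an illustration under stated assumptions, not a transcription of the paper: it takes the norm to be, up to sign, the k-rowed determinant whose i-th row is v_i, ..., v_{i+k−1}, and the helper names are hypothetical. The Fibonacci numbers satisfy the third-order recurrence u_{n+3} = 2u_{n+2} − u_n as well as their usual second-order one, so their 3-rowed determinant should vanish, while that of the third-order unit sequence should not.

```python
def recurrence_terms(coeffs, initial, count):
    """First `count` terms of u_{n+k} = a_1 u_{n+k-1} + ... + a_k u_n."""
    k = len(coeffs)
    terms = list(initial[:k])
    while len(terms) < count:
        terms.append(sum(a * terms[-i] for i, a in enumerate(coeffs, start=1)))
    return terms[:count]


def determinant(matrix):
    """Exact integer determinant by cofactor expansion (adequate for small k)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


def hankel_determinant(terms, k):
    """Determinant of the k-rowed matrix with rows (v_i, ..., v_{i+k-1})."""
    return determinant([list(terms[i:i + k]) for i in range(k)])


third_order = [2, 0, -1]                              # u_{n+3} = 2u_{n+2} - u_n
fib = recurrence_terms(third_order, [0, 1, 1], 8)     # Fibonacci numbers
unit = recurrence_terms(third_order, [0, 0, 1], 8)    # third-order unit sequence
print(hankel_determinant(fib, 3), hankel_determinant(unit, 3))
```

With respect to its lowest-order recurrence (k = 2) the Fibonacci sequence gives a non-vanishing determinant, as the remark following the theorem requires.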

Another rational integer associated with a sequence (v_n) is its container.
For some purposes this is more useful than the norm.


DEFINITION. If (v_n) ↔ g(x) then the container of (v_n) is the least positive
rational integer which g(x) divides.


Since the rational integers which g(x) divides form a modulus, they are 
all multiples of the container. There is a very close relationship between the 
container and the norm given by the following theorem: 


THEOREM 3.2. Precisely the same primes divide the container and the norm. 


If m is the container of (v_n) ↔ g(x), then h(x)g(x) = m. Hence N(h(x))N(g(x))
= N(m) = m^k and so N(v_n) | m^k. But from the definition of the container
m | N(v_n). Hence every prime dividing m divides N(v_n) and conversely.


III. PERIOD PATTERNS 
4. If for a fixed τ and all n ≥ n_0,

(4.1) u_{n+τ} ≡ u_n (mod m),

then the least τ is said to be the period* of (u_n) modulo m and the least n_0
the numeric. The principal results on periods and numerics are given in
Carmichael [1], Engstrom [3], and Ward [11]. Here we shall not consider
periods individually. We shall study the period patterns of sequences
satisfying (2.1), where two sequences are said to have the same period pattern
modulo p if their periods modulo p^j (j = 1, 2, 3, ...) are the same.

* The notation and terminology of this paper are in agreement with Ward [11] for the most part.
I use “period” following Engstrom rather than Ward's “characteristic number.”
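
The period and the numeric of a given sequence modulo m can be found by direct search. The following is a minimal Python sketch under stated assumptions, not a method taken from the paper: the function name `period_and_numeric` is hypothetical, and the recurrence (2.1) is assumed to read u_{n+k} = a_1 u_{n+k−1} + ··· + a_k u_n. Since each term modulo m is determined by the k preceding residues, the sketch follows the k-term state until a state repeats; the first repeated state marks the numeric and the distance between its two occurrences is the period.

```python
def period_and_numeric(coeffs, initial, modulus):
    """Return (period, numeric) of the sequence modulo `modulus`.

    `coeffs` is [a_1, ..., a_k]; `initial` supplies u_0, ..., u_{k-1}.
    """
    k = len(coeffs)
    state = tuple(v % modulus for v in initial[:k])
    first_seen = {}
    step = 0
    while state not in first_seen:
        first_seen[state] = step
        nxt = sum(a * state[-i] for i, a in enumerate(coeffs, start=1)) % modulus
        state = state[1:] + (nxt,)
        step += 1
    numeric = first_seen[state]
    return step - numeric, numeric


# Periods of the Fibonacci sequence modulo 2, 4, 8: one period pattern modulo p = 2.
pattern = [period_and_numeric([1, 1], [0, 1], 2 ** j)[0] for j in (1, 2, 3)]
# A sequence with non-zero numeric: u_{n+1} = 2 u_n, u_0 = 1, modulo 8.
doubling = period_and_numeric([2], [1], 8)
print(pattern, doubling)
```

The search visits at most m^k states, so it is practical only for small moduli and orders; the congruence method described in the text avoids this enumeration.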

Following Ward [11], we reduce the determination of periods to the
solution for τ_i of congruences

(4.2) (x^{τ_i} − 1)g(x) ≡ 0 (modd p^j, F_i(x)),

where (u_n) ↔ g(x), and

(4.3) = Fi(x)Fo(x) - - (mod p’), Fi(x) = hi(x)* (mod p), 
the /;(x) being irreducible and distinct modulo p.* 


THEOREM 4.1. The partial period $\tau_i$ of $(u_n) \leftrightarrow g(x)$ modulo $p^j$ is the exponent to which $x$ belongs modulo the ideal $A_j$ of all $y$ satisfying $y\,g(x) \equiv 0 \ (\mathrm{modd}\ p^j, F_i(x))$.

This theorem and its proof are obvious, but it seems desirable to empha- 
size from the start the relation of periods to ideal theory. 

As we shall confine our attention to a single component $F_i(x)$, the subscript $i$ will be omitted hereafter. Moreover we may whenever desirable restrict ourselves to ordinary sequences for which $g(x) \not\equiv 0 \ (\mathrm{modd}\ p, F(x))$. If $(u_n)$ is not ordinary, then $g(x) \equiv p^s k(x) \ (\mathrm{modd}\ p^j, F(x))$, where $(v_n) \leftrightarrow k(x)$ is ordinary, and the period of $(u_n)$ modulo $p^j$ is the period of $(v_n)$ modulo $p^{j-s}$.

If $\lambda$ is the exponent to which $x$ belongs modulo $p$, $h(x)$,† that is, the least solution of

(4.4)    $x^{\lambda} \equiv 1 \ (\mathrm{modd}\ p, h(x))$,

and if $r$ is determined from the exponent $e$ of $h(x)$ in (4.3) by

(4.5)    $p^{r-1} < e \le p^{r}$,

then $\nu = p^{r}\lambda$ is the principal period of (2.1) modulo $p$ (Ward [11], Theorem 10.4). The principal period (by definition the period of the unit sequence) modulo $p$ is a fortiori a period of any sequence modulo $p$. But no writer has considered the possibility that a sequence may have a period which is a proper divisor of $\nu$. This can indeed happen and we shall later see that such periods appear in period patterns in a special role.
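The relation $\nu = p^{r}\lambda$ can be checked on a small case. The following is a minimal sketch under stated assumptions, not a computation taken from the paper: it assumes the characteristic $f(x) = (x+1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1$ and $p = 3$, so that $h(x) = x + 1$, $\lambda = 2$, and the multiplicity 4 satisfies $3 < 4 \le 9$, giving $r = 2$. The helper name `principal_period` is hypothetical; it iterates the unit sequence until its initial block of residues recurs.

```python
def principal_period(coeffs, p):
    """Least period modulo p of the unit sequence 0, ..., 0, 1.

    coeffs = (a_1, ..., a_k) encodes u_{n+k} = a_1*u_{n+k-1} + ... + a_k*u_n.
    Assumes a_k is prime to p, so the sequence is purely periodic.
    """
    k = len(coeffs)
    start = (0,) * (k - 1) + (1,)
    state, n = start, 0
    while True:
        # Next residue from the k most recent terms (newest term first).
        nxt = sum(a * s for a, s in zip(coeffs, reversed(state))) % p
        state = state[1:] + (nxt,)
        n += 1
        if state == start:
            return n


# Assumed example: f(x) = (x + 1)^4, i.e.
# u_{n+4} = -4u_{n+3} - 6u_{n+2} - 4u_{n+1} - u_n, taken modulo 3.
nu = principal_period((-4, -6, -4, -1), 3)
lam, r = 2, 2  # order of x modulo (3, x + 1); 3^(r-1) < 4 <= 3^r
print(nu, 3**r * lam)
```

Both printed numbers should agree, since the period of the unit sequence is the principal period.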


* This is the well known Schönemann decomposition. The $F_i(x)$ are unique and $p$-adically determinate.

† Information on the value of $\lambda$ is given by a law of higher reciprocity due to F. K. Schmidt [9], pp. 165-166. For a particularly simple proof of this relation see O. Ore [8], pp. 272-273.

THEOREM 4.2. The period of any ordinary sequence modulo $p^j$ is of the form $p^i\lambda$.

Since $p^{j-1}\nu = p^{r+j-1}\lambda$ is a general period modulo $p^j$ (Engstrom [3], p. 217), it is sufficient to show that the period of an ordinary $(u_n)$ is a multiple of $\lambda$. If the ideal $A_j$ of Theorem 4.1 is the unit ideal, then $g(x) \equiv 0 \ (p^j, F(x))$ and $(u_n)$ is not ordinary. Otherwise, since $A_j$ contains the primary ideal $[p^j, F(x)]$, $A_j$ must be contained in the prime ideal $[p, h(x)]$. Hence we must have

(4.6)    $x^{\tau} \equiv 1 \ (\mathrm{modd}\ p, h(x))$,

and $\tau$ is a multiple of $\lambda$.

Now $\nu$ may be the principal period not only modulo $p$, but also modulo $p^{\sigma}$, where the greatest $\sigma$ will be called the defect.* Again, for a particular sequence $(u_n) \leftrightarrow g(x)$, $\nu$ may be a period not only modulo $p^{\sigma}$ but also modulo $p^{\sigma+\rho}$, where $\rho$ is called the initial defect as it depends on the initial values. The values of $\rho$ and $\sigma$ determine all periods which are multiples of $\nu$ as given in the following:


THEOREM 4.3. If $p^{\sigma} \neq 2$, and $\nu$ is a period of $(u_n)$ modulo $p^{\rho+\sigma}$, then the period of $(u_n)$ modulo $p^j$ for $j > \rho + \sigma$ is $p^{j-\rho-\sigma}\nu$; when $p^{\sigma} = 2$, if $2\nu$ is a period modulo exactly $2^s$, then the period of $(u_n)$ modulo $2^j$ is $2^{j-s+1}\nu$ for $j > s$.

For the method of proof see Ward [11], pp. 619-620. The statement of his Theorem 11.1 is inaccurate as it does not consider the possibility of periods which are proper divisors of $\nu$. Note that $\nu$ is not said to be the period of $(u_n)$ modulo $p^{\rho+\sigma}$ but merely one of the periods. Examples can be given in which the period of $(u_n)$ modulo $p^{\rho+\sigma}$ is a proper divisor of $\nu$.

Example. $u_{n+3} = 3u_{n+1} - u_n$ with $u_0 = 1$, $u_1 = 2$, $u_2 = 4$. Here $\nu = 6$ is a period of $(u_n)$ modulo 3 but not modulo 9. The period of $(u_n)$ modulo 3 is, however, 2.
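This example is easy to check numerically. The sketch below assumes the recurrence reads $u_{n+3} = 3u_{n+1} - u_n$ with initial values 1, 2, 4, a reading of the damaged formula that agrees with the periods stated. The helper `period_and_numeric` is a hypothetical name; it records each block of $k$ consecutive residues until one repeats, which gives both the period and the numeric.

```python
def period_and_numeric(coeffs, init, m):
    """Return (period, numeric) of a linear recurring sequence modulo m.

    coeffs = (a_1, ..., a_k) encodes u_{n+k} = a_1*u_{n+k-1} + ... + a_k*u_n;
    init = (u_0, ..., u_{k-1}).
    """
    state = tuple(v % m for v in init)
    seen = {}  # block of k consecutive residues -> index where it first occurred
    n = 0
    while state not in seen:
        seen[state] = n
        nxt = sum(a * s for a, s in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
        n += 1
    first = seen[state]
    return n - first, first


# Assumed reading of the example: u_{n+3} = 0*u_{n+2} + 3*u_{n+1} - u_n.
for m in (3, 9):
    print(m, period_and_numeric((0, 3, -1), (1, 2, 4), m))
```

Under this reading the period modulo 3 comes out as 2, a proper divisor of $\nu = 6$, while 6 is not a period modulo 9.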

5. From Theorem 4.2 the periods of $(u_n)$ modulo $p^j$ are among the numbers $\lambda, p\lambda, p^2\lambda, \cdots$. Let $p^{j(i)}$ be the highest power of $p$ for which $p^i\lambda$ is a period of $(u_n)$. Then the period of $(u_n)$ modulo $p^j$ is $p^i\lambda$, where $i$ is the least number for which $j(i) \ge j$. And from Theorem 4.3 the period pattern of $(u_n)$ is completely determined given $j(0), \cdots, j(r)$. We may represent a period pattern graphically if, in the cartesian plane, we plot the points $(i, j(i))$ and connect successive points with straight lines. The resulting graph will be called the period path of $(u_n)$.

In congruences (modd $p^j$, $F(x)$), the partial container plays the role of the container in R(x). The partial container is zero or the least positive rational number which $g(x)$ divides modulo $p^j$, $F(x)$. Evidently the partial container must be $p^{\delta_1}$ with $\delta_1 \le \delta$ if $p^{\delta}$ is the highest power of $p$ dividing the container of $g(x)$. It may be shown without difficulty that the exponent $\delta_1$ is the same whatever the value of $j$.

* Ward [11], p. 622, gives some information which shows that the defect can be greater than unity only in certain rare cases. The determination of the defect is the generalization of the (unsolved) problem of the Fermat quotients $(a^{p-1} - 1)/p$.


THEOREM 5.1. If $p^{\delta_i}$ is the partial container of $x^{p^i\lambda} - 1$ (modd $p^j$, $F(x)$) then $p^i\lambda$ is a period of some ordinary $(v_n)$ modulo $p^{\delta_i}$, but of no ordinary $(u_n)$ modulo $p^j$ for $j > \delta_i$.

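Let $h(x)(x^{p^i\lambda} - 1) \equiv p^{\delta_i}$ (modd $p^j$, $F(x)$). If $h(x) \equiv 0 \ (p, F(x))$, then $h(x) \equiv p\,k(x)$ (modd $p^j$, $F(x)$) and $k(x)(x^{p^i\lambda} - 1) \equiv p^{\delta_i - 1} \ (p^{j-1}, F(x))$, and $p^{\delta_i}$ is not the partial container of $x^{p^i\lambda} - 1$, contrary to the assumption. Hence $h(x) \not\equiv 0 \ (p, F(x))$ and $(v_n) \leftrightarrow h(x)$ is ordinary. Moreover $p^i\lambda$ is a period of $(v_n)$ modulo $p^{\delta_i}$ since $h(x)(x^{p^i\lambda} - 1) \equiv 0 \ (p^{\delta_i}, F(x))$. Suppose that $p^i\lambda$ is a period of some $(u_n) \leftrightarrow g(x)$ modulo $p^{\delta_i+1}$. Then $g(x)(x^{p^i\lambda} - 1) \equiv 0 \ (p^{\delta_i+1}, F(x))$ and so $g(x)h(x)(x^{p^i\lambda} - 1) \equiv 0 \ (p^{\delta_i+1}, F(x))$ or $g(x)\,p^{\delta_i} \equiv 0 \ (p^{\delta_i+1}, F(x))$, whence $g(x) \equiv 0 \ (p, F(x))$ and $(u_n)$ is not ordinary.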

If $f(x)$ has a root of unity among its roots, the container of some $x^n - 1$ is zero and this theorem is without content in the form here given. In §6 it will be shown in what way the theory may be modified to cover this case.

The complete set of (ordinary) period paths modulo 3 belonging to the recurrence


(5.1) = — — + — Un 


is given here. 
There are six different period paths: 


(0,0,0,1,---)=1 passing through (0, 0); (1, 0); (2, 1), 

(i, —1, —2,11,---)a®41 (0, 1); (1, 1); (2, 2), 

(1, -1, —2,8,---)@x2-2 (0, 1); (1, 2); (2, 3), 

(1,2, -5, (0, 1); (1, 3); (2, 4), 

(1, -—1,1,8,---)2#+3xe+4+1 (0, 2); (1, 2); (2, 3). 
Here, as $f(x) \equiv (x+1)^4 \pmod{3}$, we have $\lambda = 2$, $r = 2$, and $\nu = 18$. Theorem 4.3 shows that the period paths to the right of the line whose abscissa $r$ corresponds to $\nu$ are straight lines of inclination $\pi/4$.

The period polygon (dotted in the following figure) is defined as that polygon bounded on the left by the line $x = 0$, on the right by $x = r$, above by the line segments passing through the points $(i, \delta_i)$ in order, and below by the segments joining $(0, 0)$, $(r-1, 0)$, and $(r, \sigma)$. Theorems 4.3 and 5.1 dispose of all period paths except those parts of ordinary period paths lying within the period polygon.

Principle of addition of paths. If the path of $g(x)$ passes through $(i, j_1)$ and the path of $k(x)$ passes through $(i, j_2)$, then the path of $g(x) + k(x)$ passes through $(i, j')$, where $j' = \min(j_1, j_2)$ when $j_1 \neq j_2$ and $j' \ge j_1$ when $j_1 = j_2$.


Proof. Suppose $j_1 < j_2$. Then $(x^{p^i\lambda} - 1)g(x) \equiv 0 \ (p^{j_1}, F(x))$, $\not\equiv 0 \ (p^{j_1+1}, F(x))$ and $(x^{p^i\lambda} - 1)k(x) \equiv 0 \ (p^{j_2}, F(x))$. Hence $(x^{p^i\lambda} - 1)(g(x) + k(x)) \equiv 0 \ (p^{j_1}, F(x))$, $\equiv (x^{p^i\lambda} - 1)g(x) \not\equiv 0 \ (p^{j_1+1}, F(x))$. For $j_1 = j_2$ the proof is obvious.

All $h(x)$ which satisfy $h(x)(x^{p^i\lambda} - 1) \equiv 0 \ (p^j, F(x))$ form an ideal $A_{i,j}$. The path of an element belonging to $A_{i,j}$ will pass through or above $(i, j)$. The path of $(u_n)$ shows clearly to which $A_{i,j}$ $g(x)$ belongs and to which it does not belong.

An ideal is said to be ordinary if it contains an ordinary element. 


Inclusion principle. If A is an ordinary ideal and is not included in the 
ideal B, then A contains an ordinary element which is not contained in B. 


Proof. Choose an ordinary $h(x)$ from $A$. If $h(x)$ is in $B$, then we choose a $k(x)$ from $A$ which is not in $B$. If $k(x)$ is ordinary, it satisfies our requirements. If $k(x)$ is not ordinary, then $h(x) + k(x)$ is ordinary and is in $A$ but not in $B$.

Since the period of $(u_n)$ modulo $p^{j+1}$ is a multiple of the period modulo $p^j$, any period path must remain horizontal or go upward as we move to the right. We shall find further properties of the paths and give some existence theorems.


THEOREM 5.2. There is an ordinary path passing through an arbitrary point 
of the period polygon. 

The lower boundary of the period polygon is the period path of the unit sequence. By Theorem 5.1 there is an ordinary path through an arbitrary point of the upper boundary. Note that as in the example of (5.1) the upper boundary is not necessarily a period path itself. If a point is $s$ units above the lower boundary, then $(u_n) \leftrightarrow p^s$ passes through this point, and if we add this to the ordinary path passing through the point on the upper boundary immediately above it, we find an ordinary path passing through this point.


COROLLARY. A 1s properly contained in A ;,;. 


By actual calculation h(x)*-**] for i=0,---, r—1. Also 
pih(x)*-”' is in A;,; but not in Aj1,;; hence A;4,; is properly included 
in A;,;. Using these simple facts, the principle of inclusion, and the principle 
of addition of paths, we may prove the existence of a variety of paths. 


THEOREM 5.3. There is an ordinary path through the following points of the period polygon: (a) $(i-1, j)$, $(i, j)$, and $(i+1, j)$ provided that the last is not on the right boundary; (b) $(i-1, j-1)$, $(i, j)$, and $(i+1, j)$ if all are interior points; (c) $(i-1, j)$, $(i, j)$, and $(i+1, j+1)$ if all are interior points. (d) There is a path going up from every point on the lower boundary. (e) There is a horizontal path to the right from every point on the left or upper boundary. (f) There is an upward path from every point on the left boundary except possibly at the upper corner. (g) There is a path coming from below to every point on the right boundary, with one possible exception when $p^{\sigma} = 2$, and $[p, h(x)] = [2, x+1]$.


The method of proof is the same in all cases, and only (g) will be proved here. Let the ordinary path $(v_n)$ through $(r, \delta_r)$ pass through $(r-1, t)$. The unit sequence is ordinary and passes through $(r-1, 0)$ and $(r, \sigma)$. If $(u_n) \leftrightarrow p^{j-\sigma}$, ($\sigma \le j < \delta_r$), then $(u_n + v_n)$ is ordinary and passes through $(r, j)$ and through $(r-1, j-\sigma)$ if $j - \sigma < t$, through $(r-1, t)$ if $j - \sigma > t$. Hence there is an exception only if $j - \sigma = t$ and $(u_n + v_n)$ passes through $(r-1, j)$ and $(r, j)$, that is, $(r-1, t+\sigma)$ and $(r, t+\sigma)$. If $\sigma > 1$, then $(w_n) \leftrightarrow p^{t+\sigma-1}$ passes through $(r-1, t+\sigma-1)$ and $(r, t+2\sigma-1)$; whence $(u_n + v_n + w_n)$ passes through $(r-1, t+\sigma-1)$ and $(r, t+\sigma)$. If $\sigma = 1$, it may happen that $(u_n + v_n + w_n)$ passes through $(r-1, t+\sigma-1)$ and $(r, s)$ with $s > t + \sigma = t + 1$. We now have an ordinary path $(y_n) \leftrightarrow y(x)$ through $(r-1, t+1)$ and $(r, t+1)$ and another $(z_n) \leftrightarrow z(x)$ through $(r-1, t)$ and $(r, s)$ with $s > t+1$ and no path to $(r, t+1)$ from below. $(y_n + z_n)$ passes through $(r-1, t)$ and $(r, t+1)$ but may not be ordinary. If it is not ordinary, take $c \not\equiv 0, 1 \ (p, h(x))$; then the path of $y(x) + c\,z(x)$ is ordinary and passes through $(r-1, t)$ and $(r, t+1)$ as we wished. It is impossible to find such a $c$ only if $[p, h(x)] = [2, x+1]$. In this case there may be an exception.

Example. If $F(x) = x^2 + 2x + 5$, $p = 2$, there is no ordinary path from below to $(1, 3)$.

6. When $f(x)$ has one of its roots a root of unity, Theorem 5.1 may be without content, but all period paths may nevertheless be predicted by a finite process if we modify the theory as in the following example:

(6.1)    $u_{n+4} = -u_{n+3} + 3u_{n+2} - 4u_{n+1} + 2u_n$.

Here $f(x) = (x^2 - x + 1)(x^2 + 2x - 2)$, and if we take $p = 3$, then $\lambda = 2$ and $\nu = 18$.


(6.2)    $(x^2 - 1)(5x^3 + 9x^2 - 6x + 17) = -9$,
         $(x^6 - 1)(x^2 + 2x - 2) = 0$,
         $(x^{18} - 1)(x^2 + 2x - 2) = 0$.


Hence 2 may be a period modulo 1, 3, or 9, but 6 may be a period modulo $3^i$ for any $i$. Let $(u_n) \leftrightarrow g(x)$ be any sequence satisfying (6.1). Then we may write $g(x) = g_1(x)(x^2 + 2x - 2) + g_2(x)$, where $g_2(x) = ax + b$. If $(u_n)$ does not satisfy a recurrence of order lower than (6.1), then by the Theorem of Kronecker, $g_2(x)$ may not vanish since $N[g_1(x)(x^2 + 2x - 2)] = 0$.

Let $g_2(x) = 3^i g_3(x)$, where $g_3(x) \not\equiv 0 \pmod{3}$. Then $(x^6 - 1)g(x) = (x^6 - 1)g_2(x)$ or $(x^6 - 1)g(x) = 3^i(x^2 - x + 1)(x^4 + x^3 - x - 1)g_3(x)$ (mod $f(x)$). Hence we may reduce $(x^4 + x^3 - x - 1)g_3(x)$ modulo $x^2 + 2x - 2$, that is, to $(-11x + 7)g_3(x)$, and as the partial container of $-11x + 7$ (modd $3^j$, $x^2 + 2x - 2$) is 3, we find that $(x^6 - 1)g(x)$ is divisible either by $3^i$ or $3^{i+1}$. Similarly the partial container of $(x^{18} - 1)/(x^2 - x + 1)$ (modd $3^j$, $x^2 + 2x - 2$) is 9. We may summarize these results by saying that 2 is a period modulo 1, 3, or 9; 6 is a period modulo $3^i$ or $3^{i+1}$; and 18 is a period modulo $3^{i+1}$ or $3^{i+2}$. For periods greater than 18, we may apply Theorem 4.3.

The methods used in this example are perfectly general and may be ap- 
plied to any case in which the characteristic has roots of unity. 


THEOREM 6.1. If the greatest common divisor of $x^{p^i\lambda} - 1$ and $F(x)$ is $r(x)$, then $p^i\lambda$ is a period of $(u_n) \leftrightarrow g(x)$ modulo $p^t$ ($s \le t \le s + \delta_i$), where $g(x) \equiv p^s g_1(x)$ (modd $p^j$, $F(x)/r(x)$) and $p^{\delta_i}$ is the partial container of $(x^{p^i\lambda} - 1)/r(x)$ (modd $p^j$, $F(x)/r(x)$).

7. If $p^{\sigma} = 2$, we have not only the possibility of an exception to Theorem 5.3, but Theorem 4.3 is applicable only to periods greater than $2\nu$. Closer study of this case yields some interesting results.

In this case we will have

(7.1)    $x^{\nu} - 1 \equiv 2M(x)$ (modd $2^j$, $F(x)$),

where $M(x) \not\equiv 0 \ (2, F(x))$. By direct calculation we obtain

(7.2)    $x^{2\nu} - 1 \equiv 4M(x)[M(x) + 1] \ (2^j, F(x))$.

If $g(x)M(x)[M(x) + 1] \equiv 2^i S(x) \ (2^j, F(x))$, then $2\nu$ is a period of $(u_n) \leftrightarrow g(x)$ modulo $2^{i+2}$. But of the two quantities $M(x)$, $M(x) + 1$, one or the other must be relatively prime to the modulus. Suppose $M(x) + 1$ is relatively prime to the modulus. Then $g(x)M(x) \equiv 2^i S(x)/(M(x) + 1) \equiv 2^i S_1(x) \ (2^j, F(x))$, and from (7.1) $g(x)(x^{\nu} - 1) \equiv 2^{i+1}S_1(x) \ (2^j, F(x))$, whence $\nu$ is a period of $(u_n)$ modulo $2^{i+1}$. In this case the period paths to the right of $x = r$ (corresponding to period $\nu$) are straight lines of inclination $\pi/4$, that is, Theorem 4.3 may be applied for all periods greater than $\nu$, as in the case of paths when $p^{\sigma} \neq 2$.

Suppose, on the other hand, that $M(x) + 1$ is not relatively prime to the modulus. In this case $M(x)$ is relatively prime and its partial container is 1. Hence from (7.1) every ordinary period path goes through $(r, 1)$. Let $M(x) + 1 = 2^{\kappa}T(x)$, $T(x) \not\equiv 0 \ (2, F(x))$, and let $2^{\delta}$ be the partial container of $T(x)$ (modd $2^j$, $F(x)$). Then it is easily shown by previous methods that there are ordinary paths passing through $(r+1, i)$ with $2 + \kappa \le i \le 2 + \kappa + \delta$. An example of this is given by $F(x) = x^4 + 3x^2 + 4x - 3$. Here $\nu = 6$,

(7.3)    $x^6 - 1 = 2(-2x^3 + 6x^2 + 6x - 5) = 2M(x)$.

Here $M(x) + 1 = 2T(x)$, where the partial container of $T(x)$ is 2. The graph of the period paths has a curious “elbow” in it.


IV. NUMERICS AND NULL SEQUENCES 


8. The numeric of a sequence $(u_n)$ modulo $m$ is the least integer $n(m)$ for which $u_{n+\tau} \equiv u_n \pmod{m}$, $n \ge n(m)$, where $\tau$ is the period of $(u_n)$ modulo $m$. The null index (if it exists) of a sequence $(u_n)$ modulo $m$ is the least integer $n(m)$ for which $u_n \equiv 0 \pmod{m}$, $n \ge n(m)$, and is of course also the numeric.
Theorem 4.1 of Ward [11] shows that the existence of a sequence with a pre- 
scribed numeric implies the existence of another sequence with the same null 
index. This theorem may be extended to include a prescribed set of numerics 
for a finite number of moduli. 

Let $f(x) \equiv F(x)G(x) \pmod{p^j}$, where $F(x) \equiv x^e \pmod{p}$ and $G(x) \not\equiv 0$ (modd $p$, $x$). This is the Schönemann decomposition of $f(x)$ with the remaining factors $F_2(x), F_3(x), \cdots$ combined into the single term $G(x)$. The congruence for the numeric of $(u_n)$ modulo $p^j$ is

(8.1)    $x^n g(x) \equiv 0$ (modd $p^j$, $F(x)$).


THEOREM 8.1. A necessary and sufficient condition that $(u_n) \leftrightarrow g(x)$ be null modulo $p^j$ is that $g(x) \equiv 0$ (modd $p^j$, $G(x)$).

This theorem is given in Ward [11], pp. 613-614. 

Let us call a sequence $p$-adically null if it is null modulo $p^j$ for all $j$.


THEOREM 8.2. If $p^{\delta}$ is the highest power of $p$ dividing the container of $(u_n) \leftrightarrow g(x)$, then $(u_n)$ cannot be null modulo $p^j$ for $j > \delta$ unless $G(x) = 1$, in which case $(u_n)$ is $p$-adically null.


Let $g(x)h(x) \equiv p^{\delta}r \pmod{f(x)}$ where $r \not\equiv 0 \pmod{p}$. Then a fortiori $g(x)h(x) \equiv p^{\delta}r$ (modd $p^j$, $G(x)$) and hence $g(x) \not\equiv 0$ (modd $p^j$, $G(x)$) for $j > \delta$ unless $G(x) = 1$, in which case $g(x) \equiv 0$ (modd $p^j$, $G(x)$) for all $j$.

A less precise but more striking way of stating the same theorem is the 
following: 


THEOREM 8.3. If (2.1) is the recurrence of lowest order which $(u_n)$ satisfies, then $(u_n)$ is $p$-adically null if and only if $p$ divides all of $a_1, \cdots, a_k$.
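Theorem 8.3 can be illustrated numerically. The sketch below assumes the recurrence $u_{n+2} = 2u_{n+1} + 4u_n$ with $p = 2$ and the unit sequence 0, 1 (an example chosen here for illustration, not one from the paper). Since 2 divides both coefficients, the sequence should be null modulo every power of 2. The function name `null_index` and its `limit` parameter are hypothetical.

```python
def null_index(coeffs, init, m, limit=10_000):
    """Least n0 with u_n = 0 (mod m) for all n >= n0, or None if not found.

    coeffs = (a_1, ..., a_k) encodes u_{n+k} = a_1*u_{n+k-1} + ... + a_k*u_n.
    Once k consecutive residues vanish, every later residue vanishes too,
    so the null index is the first position of such an all-zero block.
    """
    state = tuple(v % m for v in init)
    for n in range(limit):
        if not any(state):
            return n
        nxt = sum(a * s for a, s in zip(coeffs, reversed(state))) % m
        state = state[1:] + (nxt,)
    return None  # no all-zero block within `limit` steps


# Assumed example: p = 2 divides a_1 = 2 and a_2 = 4.
for j in range(1, 5):
    print(j, null_index((2, 4), (0, 1), 2**j))

# Contrast: the Fibonacci recurrence has coefficients prime to 2,
# and its unit sequence is never null modulo 2.
print(null_index((1, 1), (0, 1), 2))
```

The null index grows with $j$ in the first loop, while the second call returns `None` because no all-zero block ever appears.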


It is possible to graph numeric paths of sequences in a way similar to that in which period paths were graphed. If $n$ is a numeric of $(u_n)$ modulo $p^j$ but not modulo $p^{j+1}$, we plot the point $(n, j)$. The broken line joining these points for $n = 0, 1, 2, \cdots$ is the numeric path of $(u_n)$. There is an analogue to Theorem 5.1 for numeric paths, but not to Theorem 4.3. We note that a sequence is ordinary [$g(x) \not\equiv 0 \ (p, F(x))$] if it is not purely periodic modulo $p$.


THEOREM 8.4. The numeric path of the unit sequence is the lower boundary of all numeric paths, and the segments joining the points $(i, \delta_i)$, where $p^{\delta_i}$ is the partial container of $x^i$ modulo $p^j$, $F(x)$, form the upper boundary of ordinary numeric paths.

The proof is straightforward and parallels that of Theorem 5.1. The portion of the plane included between the lower and upper boundaries of ordinary numeric paths will be called the numeric sector.


THEOREM 8.5. Let $F(x) = x^e + b_1x^{e-1} + \cdots + b_e$ and let $p^{a_i}$ be the highest power of $p$ dividing $b_i$. If we define $\alpha = a_i/i = \min(a_1, a_2/2, \cdots, a_e/e)$, then a point on the lower boundary of the numeric sector is either on or within a distance $e a_i$ below the ray through the origin whose slope is $\alpha$. If $\beta = \beta_j/j = \min(a_{e-1}, (a_{e-2} + a_e)/2, \cdots, (e-1)a_e/e)$, then any point on the upper boundary is either on or within a distance $e\beta_j$ above the ray through the origin whose slope is $a_e - \beta$.

We may define a commutative algebra $\mathfrak{A}(x)$ over the field of all algebraic numbers by taking $1, x, \cdots, x^{e-1}$ as basis elements and putting $F(x) = 0$. The element $y = p^{-\gamma}x$ is integral in the sense of Deuring [2] (p. 68) if $\gamma$ is a rational number less than or equal to $\alpha$, but not for any $\gamma$ greater than $\alpha$. Hence $z = p^{-a_i}x^i = y^i$ satisfies an equation

(8.2)    $z^e = c_1z^{e-1} + \cdots + c_e$

of degree not greater than $e$, where the $c$'s are rational since $z$ is a rational function of $x$, and are integral since $z$ is an integral element of $\mathfrak{A}(x)$. Not all $c$'s are divisible by $p$, for if they were, then $p^{-1/e}z$ would be an integral element of $\mathfrak{A}(x)$ and so also would be some $p^{-\gamma}x$ with $\gamma > \alpha$.

If $(u_n)$ is the unit sequence satisfying a recurrence of characteristic $F(x)$, then by the secondary isomorphism (2.5)

(8.3)    $x^n \to u_n$.

Hence since $p^{a_i}z = x^i$, we may write

(8.4)    $p^{n a_i}x^{\rho}z^n \to u_{ni+\rho}$.

The elements $(x^{\rho}z^n)$, $n = 0, 1, 2, \cdots$ and $\rho$ fixed as one of $0, 1, \cdots, i-1$, satisfy the recurrence

(8.5)    $w_{n+e} = c_1w_{n+e-1} + \cdots + c_ew_n$.

Hence the leading coefficients in their minimal residues modulo $F(x)$ must also satisfy the same recurrence. But from (8.4) these are the sequence $(u_{ni+\rho}/p^{n a_i})$, ($n = 0, 1, \cdots$). Now the coefficients of (8.5) are rational integers and hence the denominators of a sequence satisfying this recurrence cannot exceed the denominators of the initial values. Hence the denominator of any one of the sequences $(u_{ni+\rho}/p^{n a_i})$, ($\rho = 0, 1, \cdots, i-1$), cannot exceed $p^{(e-1)a_i}$ and a fortiori $p^{e a_i}$. Hence $u_{ni+\rho} = p^{n a_i - \sigma}v_n$, where $v_n$ is integral and $\sigma \le (e-1)a_i$. As this relation holds for $\rho = 0, \cdots, i-1$, we have $x^{ni+\rho} \equiv 0 \ (p^{n a_i - \sigma})$. Hence the lower boundary passes through or above the points $(ni+\rho, n a_i - \sigma)$, and as $\rho < i$, $\sigma \le (e-1)a_i$, these points are not more than $e a_i$ below the ray through the origin with slope $\alpha = a_i/i$. On the other hand, no point of the lower boundary can lie above this ray, for if $x^n \equiv 0 \ (p^j)$ with $j > n\alpha$, then $p^{-j}x^n$ is an integral element of $\mathfrak{A}(x)$ and so also is $(p^{-j}x^n)^{1/n}$ or $p^{-j/n}x$. But $p^{-\gamma}x$ cannot be an integral element of $\mathfrak{A}(x)$ with $\gamma > \alpha$.

From Theorem 8.4 to determine the upper boundary it is necessary only to find the partial container of $x^n$ (modd $p^j$, $F(x)$). We have

(8.6)    $x(x^{e-1} + b_1x^{e-2} + \cdots + b_{e-1}) = -b_e = p^{a_e}b$,

where $p \nmid b$. We may write this

(8.7)    $xw = p^{a_e}b$.

Hence if $w^n = p^s w'$ with $p \nmid w'$, then $p^{n a_e - s}$ is the partial container of $x^n$. It remains to find the greatest power of $p$ dividing $w^n$. But $w = b_e/x$ satisfies the rank equation

(8.8) w + 


Reasoning for $w^n$ from this equation as for $x^n$ from $F(x) = 0$, we find that $w^n \equiv 0 \pmod{p^{n\beta - \sigma}}$, where $0 \le \sigma \le e\beta_j$. Hence the partial container of $x^n$ is $p^{n(a_e - \beta) + \sigma}$. This shows that a point on the upper boundary of the numeric sector is either on or within $e\beta_j$ units above the ray through the origin with slope $a_e - \beta$. This completes the proof of the theorem, which incidentally justifies the use of the term “sector” for the portion of the plane containing ordinary numeric paths.

9. We now calculate the numeric paths. 


THEOREM 9.1. If $p^{\delta}$ is the partial container of $(u_n) \leftrightarrow g(x)$ modulo $p^i$, $F(x)$, then the numeric of $(u_n)$ modulo $p^i$ is at least as great as the numeric of the unit sequence modulo $p^{i-\delta}$.


COROLLARY. The numeric path of $(u_n)$ remains within a distance $\delta$ above the lower boundary of the numeric sector.


Proof. If $n$ is the numeric of $(u_n)$ modulo $p^i$, then by (8.1)

(9.1)    $x^n g(x) \equiv 0$ (modd $p^i$, $F(x)$),

and if $g(x)h(x) \equiv p^{\delta}$ (modd $p^i$, $F(x)$), then multiplying (9.1) by $h(x)$ we obtain

(9.2)    $x^n p^{\delta} \equiv 0$ (modd $p^i$, $F(x)$),

whence $x^n \equiv 0$ (modd $p^{i-\delta}$, $F(x)$) and $n$ must be at least as great as the numeric of the unit sequence modulo $p^{i-\delta}$, which is the least solution of this congruence.
In consequence of (9.1) we consider the coefficients of

(9.3)    $[x^{ni+\rho}g(x)]/p^{n a_i}$,

which for $\rho$ fixed, by a previous argument, satisfy recurrence (8.5). As before, the denominators cannot exceed $p^{e a_i}$. The coefficients of

(9.4)    $[x^{ni}]/p^{n a_i}$

cannot for any single value of $n$ all be divisible by $p$, since then the path of the unit sequence would come above the ray through the origin with slope $\alpha$. Similarly, by the corollary to Theorem 9.1 the coefficients of (9.3) cannot for a single value of $n$ all be divisible by $p^{\delta+1}$. Hence to determine the numeric path of $(u_n)$, we need only know the values of the coefficients of (9.3) modulo $p^{\delta+1}$. But this is equivalent to knowing the values of the coefficients of $p^{e a_i}[x^{ni+\rho}g(x)]/p^{n a_i}$ modulo $p^{e a_i+\delta+1}$, and these are sequences of integers satisfying (8.5). We are thus reduced to a finite problem, since the numeric and period of a sequence modulo $p^{e a_i+\delta+1}$ are finite. But in general $\delta$ is not bounded for the set of all ordinary sequences satisfying the recurrence.

10. The following properties of period paths may easily be verified to hold 
for numeric paths: 

(a) As we move to the right, the path never moves downward. 

(b) Every non-ordinary numeric path is an upward translation of an ordi- 
nary path. 

(c) The sum path of two paths passes through the lower of the two and 
through or above points of intersection. 

(d) The elements corresponding to paths passing through or above a point $(n, j)$ form an ideal $B_{n,j}$.

Using these properties and Theorem 8.4, we easily deduce the following 
theorem: 


THEOREM 10.1. Through an arbitrary point of the numeric sector there is an 
ordinary path. 


There is no analogue to Theorem 4.3 for numeric paths, but there is an important relation of another sort.


THEOREM 10.2. If any finite section at the beginning of a numeric path is cut 
off, the remainder is a translation of an entire ordinary numeric path. 


Suppose the numeric path of $(u_n) \leftrightarrow g(x)$ passes through the point $(n, j)$. We have

(10.1)    $x^n g(x) = p^j h(x)$,

where $h(x) \not\equiv 0 \pmod{p}$. The section of the numeric path of $(u_n)$ beyond $(n, j)$ is the numeric path of $(v_n) \leftrightarrow h(x)$. For if

(10.2)    $x^{m+n}g(x) = p^{i+j}t(x)$,

where $t(x) \not\equiv 0 \ (p)$, then $x^m[x^n g(x)] = x^m[p^j h(x)] = p^{i+j}t(x)$ or

(10.3)    $x^m h(x) = p^i t(x)$,

and conversely.
One consequence of this theorem is that no small section of a numeric path can have peculiarities not exhibited by a path in the neighborhood of the origin. For example, no path may at any point rise more rapidly than the first segments of the upper boundary or more slowly than the first segments of the lower boundary. The existence of a variety of path structures may be deduced from these two fundamental theorems by adding appropriate paths.

11. In one particular instance we may by a finite process predict all numeric paths. By Theorem 8.5 and the corollary to Theorem 9.1 there will certainly be infinitely many numeric paths when the upper ray is distinct from the lower ray. But when these coalesce there are only a finite number of paths. This happens only when $\alpha + \beta = a_e$. But $\alpha \le a_e/e$ and $\beta \le (e-1)a_e/e$. Hence we must have $\alpha = a_e/e$ and $\beta = (e-1)a_e/e$. We may take $z = x^e/p^{a_e}$ and we find that $N(z) = (b_e)^e/p^{e a_e} \not\equiv 0 \ (p)$. Hence in the recurrence (8.5) $c_e \not\equiv 0 \ (p)$ and the $(v_n)$ sequences are purely periodic modulo $p^j$. Consequently the lower boundary of the numeric sector must touch the ray infinitely often. If it passes through a point $(n, j)$ on the ray,

(11.1)    $x^n = p^j h(x) \ (F(x))$,

where $h(x) \not\equiv 0 \ (p, F(x))$ and $j/n = \alpha$. If $h(x) \equiv 0 \ (p, x)$, then $h(x)^e \equiv 0 \ (p, x^e) \equiv 0 \ (p, F(x))$. Then $x^{ne} = p^{je}h(x)^e \equiv 0 \ (p^{je+1}, F(x))$. But $(je+1)/ne > \alpha$ and this would mean that the lower path goes above the ray. Hence $h(x) \not\equiv 0 \ (p, x)$ and the partial container of $x^n$ must be $p^j$, and the upper boundary must also pass through $(n, j)$. Similarly the lower boundary must pass through every point on the ray through which the upper boundary passes. The boundaries form a series of sausage-like loops about the ray.
Let $(r, i)$ be a point on the numeric path of $(u_n) \leftrightarrow g(x)$. Then

(11.2)    $x^r g(x) = p^i t(x) \ (F(x))$,

where $t(x) \not\equiv 0 \ (p, F(x))$. Multiplying (11.1) by $x^r g(x)$ we obtain

(11.3)    $x^{n+r}g(x) = p^{i+j}t(x)h(x) \ (F(x))$,

and here $t(x)h(x) \not\equiv 0 \ (p, F(x))$ since $h(x)$ is relatively prime to the modulus and $t(x) \not\equiv 0 \ (p, F(x))$. Hence $(n+r, j+i)$ is on the numeric path of $(u_n)$ and similarly $(2n+r, 2j+i)$ and so on. Consequently the entire path structure is given by that in the first loop.

12. In the study of both period and numeric patterns, it has been shown that the pattern of $(u_n) \leftrightarrow g(x)$ depends on which ideals of the $A_{i,j}$ (or $B_{n,j}$) contain $g(x)$ and which do not. These are a subset of the primary ideals contained in a single prime ideal $[p, h(x)]$. Hence the following equivalence criterion is of interest:


THEOREM 12.1. A necessary and sufficient condition that elements $y$ and $z$ of R(x) belong to the same primary ideals contained in $[p, h(x)]$ is that they have the same partial container $p^{\delta}$ (modd $p^j$, $F(x)$) and that

(12.1)    $z \equiv yu \ (p^{\delta}, F(x))$,    $y \equiv zv \ (p^{\delta}, F(x))$.


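Necessity. Let the partial containers of $y$ and $z$ be $p^{\delta_1}$ and $p^{\delta_2}$, respectively. If $\delta_1 \neq \delta_2$, suppose $\delta_1 < \delta_2$. Then $[p^{\delta_2}, z, F(x)]$ does not contain $y$. For if it does, then

(12.2)    $y \equiv bz \ (p^{\delta_2}, F(x))$.

There exist $w \not\equiv 0 \ (p, F)$ with $yw \equiv p^{\delta_1} \ (p^j, F)$ and $s \not\equiv 0 \ (p, F)$ with $zs \equiv p^{\delta_2} \ (p^j, F)$. Hence $swy \equiv bswz \ (p^{\delta_2}, F)$ and so

(12.3)    $p^{\delta_1}s \equiv 0 \ (p^{\delta_2}, F)$;    $s \equiv 0 \ (p, F)$,

contradicting $s \not\equiv 0 \ (p, F)$. Consequently $\delta_1 = \delta_2 = \delta$. Now the supposition that $z$ is in $[p^{\delta}, y, F(x)]$ and that $y$ is in $[p^{\delta}, z, F(x)]$ leads to the congruences (12.1).

Sufficiency. Any primary ideal $A$ contained in $[p, h(x)]$ must contain some ideal $[p^j, F(x)]$ with a sufficiently large $j$. From (12.1) and $yw \equiv p^{\delta} \ (p^j, F(x))$

(12.4)    $z \equiv yu + p^{\delta}r \ (p^j, F(x))$
             $\equiv y(u + wr)$
             $\equiv yu^*$.

Similarly $y \equiv zv^* \ (p^j, F(x))$. A fortiori

(12.5)    $z \equiv yu^* \pmod{A}$,    $y \equiv zv^* \pmod{A}$;

hence $y \equiv 0 \pmod{A}$ implies $z \equiv 0 \pmod{A}$, and conversely.


V. CYCLES AND RESIDUAL GROUPS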


13. The residues of a sequence $(u_n)$ modulo $m$ consist of a numeric section, $u_0, u_1, \cdots, u_{n_0-1}$, where $n_0$ is the numeric modulo $m$, and a periodic section $u_{n_0}, \cdots$. The periodic section consists of infinitely many repetitions of the cycle $u_{n_0}, \cdots, u_{n_0+\tau-1}$, where $\tau$ is the period modulo $m$. Any set of $\tau$ consecutive terms from the periodic section will be called a cycle. Each cycle is a cyclic permutation of any other. For most purposes it is convenient not to consider these cycles as distinct, and in this sense a sequence $(u_n)$ has but one cycle modulo $m$. Whenever it is important to distinguish these, we shall call any particular one of them an ordered cycle. For the rest of the terminology used in this section, see Ward [12].

Here we shall restrict ourselves to the case in which the modulus is a prime $p$. If $p \mid a_{k-r}$, $p \mid a_{k-r+1}$, $\cdots$, $p \mid a_k$, then the numeric section is included in $u_0, \cdots, u_r$ and these terms, being initial values, are arbitrary. Hence, without loss of generality, $(u_n)$ may be supposed purely periodic, as the periodic section is independent of the numeric section. In addition, as the Theorem of Kronecker holds true modulo $p$, with respect to the recurrence of lowest order which $(u_n)$ satisfies modulo $p$, $p$ is not a divisor of $N(u_n)$. Hence (Ward [11], p. 604) the period of $(u_n)$ modulo $p$ is the principal period.

In this section we shall always use the secondary isomorphism 


(13.1)    $x^n g(x) \to u_n$,

and a cycle will correspond to the elements $x^n g(x)$, ($n = 0, \cdots, \tau - 1$).


THEOREM 13.1. The elements corresponding to a cycle of $(u_n)$ form a coset of the cyclic group $\{x\}$ in the group of residues relatively prime to the modulus $p$, $f(x)$.


Proof. The elements relatively prime to the modulus $p$, $f(x)$ form an abelian multiplicative group. As we are assuming $(u_n)$ to be purely periodic, $p \nmid a_k$ and the element $x$ is in this group, and so the cyclic group $\{x\}$ is also in this group. As $p \nmid N(u_n)$, $g(x)$ is relatively prime to the modulus. $x^{\tau} \equiv 1$ (modd $p$, $f(x)$) is the congruence for the (principal) period modulo $p$. Hence $\{x\}$ consists of the elements $1, x, \cdots, x^{\tau-1}$, and the elements $g(x), xg(x), \cdots, x^{\tau-1}g(x)$ form a coset of this subgroup.

As a finite abelian group, the residues relatively prime to p, f(x) are char- 
acterized completely by the generators and order invariants of the group. 
By a slight modification of the methods used in T. Takenouchi [10] these 
may be found explicitly. A knowledge of the generators and invariants will 
be presupposed here. Except when f(x) has factors modulo p of high multi- 
plicity they are very simple. 

THEOREM 13.2. If $y^n g(x) \to v_n$ by the secondary isomorphism, then $(v_n)$ satisfies a recurrence whose characteristic is the minimal polynomial of $y$.

For let

(13.2)    $F(y) = y^s - b_1y^{s-1} - \cdots - b_s = 0$.

Certainly $y$, as a polynomial in $x$, satisfies an equation of degree $k$, but it may possibly satisfy an equation of lower degree. The elements $y^n g(x)$, ($n = 0, 1, \cdots$), satisfy

(13.3)    $v_{n+s} = b_1v_{n+s-1} + \cdots + b_sv_n$

and hence their leading coefficients $(v_n)$ must also satisfy it.

In studying cycles and the distribution of residues therein, we shall use not only the group $G$ of residues prime to $p$, $f(x)$ (Theorem 13.1) and auxiliary recurrences (Theorem 13.2) but also theorems on the existence of sequences with zeros in prescribed positions, of which the two following are typical.


THEOREM 13.3. There exists a sequence $(u_n)$ satisfying (2.1) and not identically zero modulo $p$ for which $u_n \equiv 0 \pmod{p}$ for $k-1$ arbitrary values of $n$.

Proof. Write $(u_n) = (c_0w_n + c_1w_{n+1} + \cdots + c_{k-1}w_{n+k-1})$, where $(w_n)$ is the unit sequence and the $c$'s are to be determined by the following $k-1$ congruences:

(13.4)    $u_{n_1} \equiv u_{n_2} \equiv \cdots \equiv u_{n_{k-1}} \equiv 0 \pmod{p}$.

These are $k-1$ linear homogeneous congruences in $k$ variables, and so there must exist a solution in which not all the $c$'s vanish, and in turn $(u_n)$ does not vanish identically. To know the exact number of solutions, we must know the rank of the $w$-matrix involved, and in general this is not known.
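For small $p$ and $k$ the existence statement of Theorem 13.3 can be confirmed by exhaustive search instead of by solving the congruences (13.4). The following is a minimal sketch under that assumption; the function name `sequence_with_zeros` and the sample recurrences are hypothetical choices made here, and the search runs over initial values rather than over the coefficients $c_i$.

```python
from itertools import product


def sequence_with_zeros(coeffs, p, positions):
    """Search for initial values, not all zero mod p, giving u_n = 0 (mod p)
    at every n in `positions`. Returns the initial values or None.

    coeffs = (a_1, ..., a_k) encodes u_{n+k} = a_1*u_{n+k-1} + ... + a_k*u_n.
    Exhaustive over p**k candidates, so only suitable for small p and k.
    """
    k = len(coeffs)
    top = max(positions)
    for init in product(range(p), repeat=k):
        if not any(init):
            continue  # skip the identically zero sequence
        u = list(init)
        while len(u) <= top:
            # Newest term first, to match the order a_1, ..., a_k.
            u.append(sum(a * s for a, s in zip(coeffs, reversed(u[-k:]))) % p)
        if all(u[n] == 0 for n in positions):
            return init
    return None


# k = 3, p = 2, recurrence u_{n+3} = u_{n+1} + u_n, zeros wanted at n = 1 and n = 4.
print(sequence_with_zeros((0, 1, 1), 2, (1, 4)))
# k = 2, p = 5, Fibonacci recurrence, one zero wanted at n = 3.
print(sequence_with_zeros((1, 1), 5, (3,)))
```

In both calls the number of prescribed positions is $k-1$, so the theorem guarantees that the search succeeds.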
The following notation and terminology are almost exactly those of Ward [12], p. 170:

$\tau$ = period; $\mu$ = reduced period;
$e$ = exponent to which basic multiplier $m$ belongs mod $p$;
$\phi(f)$ = number of residues prime to $p$, $f(x)$;
$\kappa$ = number of blocks.

(13.5)    $\phi(f) = \phi(p, f(x))$;    $\tau = e\mu$;    $ed = p - 1$;    $\phi(f) = (p-1)\mu\kappa = d\tau\kappa$.

THEOREM 13.4. Let $0 < a < \mu$. Then among the representative reduced cycles, one from each block, there are precisely $(p^{k-2} - 1)/(p - 1)$ pairs of zeros in positions differing by $a$.


Proof. In an unordered cycle there are as many pairs of zeros differing
in position by a as there are ordered cycles made from it with zeros $u_n$ and
$u_{n+a}$, n fixed. In the block there will be $p-1$ times as many pairs of zeros
differing in position by a as there are in a representative reduced cycle. Hence
we must find the number of solutions of

(13.6)  $u_n \equiv u_{n+a} \equiv 0 \pmod{p}$

not vanishing identically and then divide by $p-1$. If we write
$(u_n) = (c_0 w_n + \cdots + c_{k-1} w_{n+k-1})$, where $(w_n)$ is the unit sequence, then the
number of solutions of (13.6) will depend on the rank of the matrix

(13.7)  $\begin{pmatrix} w_n & w_{n+1} & \cdots & w_{n+k-1} \\ w_{n+a} & w_{n+a+1} & \cdots & w_{n+a+k-1} \end{pmatrix}.$

The rank of this matrix is two. It cannot be zero, for k consecutive terms of
the unit sequence may not vanish. It cannot be one, for then the two rows
would be proportional and a would be a multiple of the reduced period. Hence
(13.6) has $p^{k-2}$ solutions, of which one vanishes identically.

We are now in a position to prove the very interesting theorem:

THEOREM 13.5. If f(x) is irreducible modulo p, and $b_i$ is the number of zeros
in a reduced cycle of the ith block (the blocks having been appropriately ordered),
then we have the following equations*

(13.8)
$\sum_i b_i = (p^{k-1}-1)/(p-1),$
$\sum_i (b_i^2 - b_i) = (\mu-1)(p^{k-2}-1)/(p-1),$
$\sum_i b_i b_{i+r} = \mu(p^{k-2}-1)/(p-1), \quad r \not\equiv 0 \pmod{\kappa}.$

Lemma 1. If f(x) is irreducible modulo p, then
(13.9)  $e = (\tau, p-1).$

Let $s = (\tau, p-1)$. From (13.5) both $\tau$ and $p-1$ are multiples of e, and
hence s is a multiple of e. Let $\tau = ws$, where w must be a divisor of $\mu$. From
the congruence for the period we have

(13.10)  $(x^w)^s \equiv 1$ (modd p, f(x))
or, writing $x^w = z$,
(13.11)  $z^s \equiv 1$ (modd p, f(x)).

Now as s is a divisor of $p-1$ this congruence must have s rational solutions,
namely those of $z^s \equiv 1 \pmod{p}$. But since f(x) is irreducible, the residues
modd p, f(x) form a Galois field and no equation can have more solutions
than its degree. Hence all the possible values of z in (13.11) must be rational
and we have

(13.12)  $x^w \equiv$ rational (modd p, f(x));
whence w must be a multiple of $\mu$. But as w is also a divisor of $\mu$, we must have
(13.13)  $w = \mu, \quad e = s = (\tau, p-1),$

and the lemma is proved.
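
Lemma 1 can be checked numerically on a small case. The sketch below makes several assumptions: it reads the reduced period $\mu$ as the least n with $x^n$ congruent to a rational constant m (modd p, f(x)), reads e as the order of m modulo p, and uses hypothetical function names. The example polynomial $x^2 - x - 1$ modulo 7 (irreducible, since 5 is not a square modulo 7) is chosen for illustration and does not come from the paper.

```python
from math import gcd


def polmul_mod(a, b, f, p):
    """Product of residues a, b (coefficients, lowest degree first) modulo p and monic f."""
    k = len(f) - 1
    prod = [0] * (2 * k - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for d in range(2 * k - 2, k - 1, -1):  # eliminate x^d using f(x) = 0
        c = prod[d]
        if c:
            for j in range(k + 1):
                prod[d - k + j] = (prod[d - k + j] - c * f[j]) % p
    return prod[:k]


def periods(f, p):
    """Return (tau, mu, e) for x modulo (p, f); assumes degree >= 2 and f(0) != 0 mod p."""
    k = len(f) - 1
    x = [0, 1] + [0] * (k - 2)
    one = [1] + [0] * (k - 1)
    power, n, mu, m = x[:], 1, None, None
    while power != one:
        if mu is None and not any(power[1:]):
            mu, m = n, power[0]  # first power of x that is a rational constant
        power = polmul_mod(power, x, f, p)
        n += 1
    tau = n
    if mu is None:
        mu, m = tau, 1
    e = next(j for j in range(1, p) if pow(m, j, p) == 1)
    return tau, mu, e


tau, mu, e = periods([-1, -1, 1], 7)  # f(x) = x^2 - x - 1, p = 7
lemma_holds = (e == gcd(tau, 7 - 1)) and (tau == mu * e)
```

In this example the powers of x are governed by the Fibonacci numbers modulo 7, so the period, reduced period and exponent can be cross-checked by hand.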
From the properties of the group of residues modulo p, f(x) we may find a
primitive root y (modd p, f(x)) such that

(13.14)  $y^{\kappa t} \equiv x$ (modd p, f(x)).

The period of y is $p^k - 1$ and by application of Lemma 1 to y-cycles (Theorem
13.2) the reduced y-period is $\mu\kappa = (p^k-1)/(p-1)$. There is but one y-cycle


* In this theorem it is tacitly assumed that the recurrence is at least of third order. There can be 
no zeros in a cycle of a first order recurrence. In a second order sequence there is only one reduced 
cycle which contains any zeros, and this contains only one zero. A partial criterion for the appearance 
of a multiple of p in a second order sequence was given in Ward [13] and a complete criterion in
Hall [5]. 
and the x-cycles, cosets of {x} in G, are obtained by starting from any term
of the y-cycle and taking every $\kappa t$th term after that. The reduced x-cycles
form the terms $y^{n\kappa t+i}$, where $n = 0, 1, \cdots, \mu-1$, and different (ordered)
reduced cycles will correspond to different values of i.

Lemma 2. Two reduced x-cycles corresponding to $i_1$ and $i_2$ will belong to the
same block if and only if $i_1 \equiv i_2 \pmod{\kappa}$.

The number of blocks is $\kappa$, and hence it is sufficient to show that multiplying
a reduced x-cycle by $y^{\kappa}$ takes it into another reduced cycle of the same
block. From (13.9) and (13.13)

(13.15)  $e = (\tau, p-1) = (\mu e, et),$
whence
(13.16)  $(\mu, t) = 1.$

Hence the equation $\mu s + tq = 1$ has integral solutions s and q. Multiplying we
get $\mu\kappa s + \kappa t q = \kappa$. Thus $y^{\kappa} = y^{\mu\kappa s} y^{\kappa t q}$. But $\mu\kappa$ is the reduced period of y and hence

(13.17)  $y^{\mu\kappa s} \equiv g$ (modd p, f(x)),
where g is rational. From (13.14) and (13.17) we now have
(13.18)  $y^{\kappa} \equiv g x^q$ (modd p, f(x)),

and multiplication of a reduced x-cycle by such a quantity takes it into another
reduced cycle of the same block.

In consequence of this lemma, we may choose as representative reduced
cycles, one from each block, the terms

(13.19)  $y^{n\kappa t+i}, \quad (n = 0, 1, \cdots, \mu-1;\ i = 0, 1, \cdots, \kappa-1).$

Lemma 3. There are as many zeros in the terms $y^{n\kappa t+i}$ as there are in the terms
$y^{n\kappa+i}$, where $n = 0, 1, \cdots, \mu-1$.

In any cycle, two terms which differ in position by a multiple of the reduced
period are both zeros or neither. Applying this to the y-cycle, as
$n = 0, 1, \cdots, \mu-1$, $n\kappa t+i$ and $n\kappa+i$ take on the same residues modulo $\mu\kappa$
and hence must contain the same number of zeros.

From Lemma 3, $b_i$ is the number of zeros in the terms $y^{n\kappa+i}$, $(n = 0, \cdots,
\mu-1)$. From Lemma 2 we shall have enumerated one cycle from each block if
we take $i = 0, 1, \cdots, \kappa-1$. Together all these terms form a single reduced
y-cycle. By Theorem 13.4 in this reduced y-cycle there will be $(p^{k-2}-1)/(p-1)$
pairs of zeros differing by a in position for every a from 1 to $\mu\kappa-1$. Two zeros
belong to the same x-cycle if and only if they differ in position in the y-cycle
by a multiple of $\kappa$. Among the differences $1, \cdots, \mu\kappa-1$ there are $\mu-1$
multiples of $\kappa$. Hence $(\mu-1)(p^{k-2}-1)/(p-1)$ differences arise from pairs of zeros
in the same x-cycle. But from the cycle of the ith block there are $b_i(b_i-1)$
differences. The second of equations (13.8) arises from equating the two methods
of enumerating these differences. Two zeros of the y-cycle differing in
position by a number congruent to r modulo $\kappa$ will be in blocks i and i+r.
Enumerating these differences we obtain the final equations of (13.8). The first
of equations (13.8) states merely that there are a total of $(p^{k-1}-1)/(p-1)$
zeros in the reduced y-cycle or $p^{k-1}-1$ zeros in the complete y-cycle. Or in
other words, there are $p^{k-1}-1$ residues modulo p, f(x) whose leading
coefficient vanishes, excluding the residue zero.
If we consider the third order recurrence 


(13.20) = — — Un 


and its cycles modulo 11, we find that $\mu = 19$ and $\kappa = 7$. The b's in order are
3, 3, 1, 3, 1, 1, 0 and it is easily verified that these numbers satisfy equations
(13.8). Similarly for the fourth order recurrence


(13.21)  $u_{n+4} = -u_{n+3} + u_{n+2} + u_{n+1} - u_n$


considered modulo 5, we find that $\mu = 13$ and $\kappa = 12$, and that the b's in order
are 3, 5, 2, 3, 0, 3, 2, 3, 4, 4, 1.
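
The verification mentioned for the third order example can be written out. The sketch below assumes that equations (13.8) read $\sum b_i = (p^{k-1}-1)/(p-1)$, $\sum (b_i^2 - b_i) = (\mu-1)(p^{k-2}-1)/(p-1)$ and $\sum b_i b_{i+r} = \mu(p^{k-2}-1)/(p-1)$ for every r not divisible by the number of blocks, with the index i+r taken cyclically; this reading follows the counting argument in the proof. The function name is hypothetical.

```python
def check_13_8(b, p, k, mu):
    """Check the three families of equations (13.8) for block zero-counts b."""
    kappa = len(b)  # number of blocks
    z1 = (p ** (k - 1) - 1) // (p - 1)
    z2 = (p ** (k - 2) - 1) // (p - 1)
    first = sum(b) == z1
    second = sum(x * x - x for x in b) == (mu - 1) * z2
    third = all(
        sum(b[i] * b[(i + r) % kappa] for i in range(kappa)) == mu * z2
        for r in range(1, kappa)
    )
    return first, second, third


# Third order example of the text: p = 11, mu = 19, seven blocks.
result = check_13_8([3, 3, 1, 3, 1, 1, 0], p=11, k=3, mu=19)
```

Only the third order example is checked here, since its list of b's contains one entry per block.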

In both these instances there are sequences satisfying the recurrence
which contain no multiples of p, namely those belonging to the block whose
b=0. It is interesting to ask if there must be zeros in every cycle if the reduced
period is large enough. This is answered by the following theorem:


THEOREM 13.6. A sequence will contain multiples of a prime p if f(x) is
irreducible modulo p and $\mu > p^{k/2}$.


This theorem is proved by obtaining $\mu < p^{k/2}$ as a consequence of the
assumption that some b=0. Suppose that one of the b's is zero. We observe that
if the sum of a fixed number of real numbers is fixed, then the sum of their
squares is a minimum when all are equal. Hence

(13.22)  $\sum b_i^2 \ge \left(\sum b_i\right)^2 / (\kappa - 1).$

Substituting the known values of $\sum b_i$ and $\sum b_i^2$ from (13.8), we obtain after
some calculation

4 (p 1) 


p*? 1 


(13.23) p*? — (p — 1) 


whence certainly $\mu$ is less than $p^{k/2}$.
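
As a small consistency check, the two numerical examples given before Theorem 13.6 each have a block with b=0, so their reduced periods should not exceed $p^{k/2}$. The sketch below compares squares so as to stay in integer arithmetic; the function name is hypothetical and the values of $\mu$, p and k are those quoted for the two examples.

```python
def zero_free_block_possible(mu, p, k):
    """True when mu <= p**(k/2), i.e. when Theorem 13.6 does not force zeros."""
    return mu * mu <= p ** k


third_order = zero_free_block_possible(19, 11, 3)   # 361 against 1331
fourth_order = zero_free_block_possible(13, 5, 4)   # 169 against 625
```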
1. Introduction. In my earlier papers on p-way matrices and associated
forms, I introduced new rank concepts of higher dimensional matrices with
the aim of obtaining a fairly general theory of equivalence of ordinary forms
and multilinear forms of arbitrary degree p, where this theory contains as a
special case the well known theory of equivalence of quadratic and bilinear
forms under non-singular linear transformations in a given field. In a paper
in these Transactions in 1936 I gave such a development for multilinear
forms. The present paper is devoted to a theory of equivalence of ordinary
forms. Specifically, the problem of determining necessary and sufficient
conditions for the equivalence of ordinary forms and multilinear forms of
arbitrary degree to forms with “diagonal matrices” is solved in these two papers
for the class of non-singular transformations in a field $\Phi$, where $\Phi$ is subject
to slight restrictions in the case of ordinary forms.

Throughout the present paper we shall use the expression “a sum of pth
powers” to denote a sum of the form

$\lambda_1 x_1^p + \cdots + \lambda_r x_r^p,$

where $\lambda_1, \cdots, \lambda_r$ are constant elements of a given field; the coefficients
$\lambda_1, \cdots, \lambda_r$ need not be unity.

Let $F = a_{ij\cdots m} x_i x_j \cdots x_m$ be a form of degree greater than or equal to 2
with coefficients in a field $\Phi$, where the matrix $A = (a_{ij\cdots m})$ of F is symmetric;
that is, $a_{1\cdots 12} = a_{1\cdots 21} = \cdots = a_{21\cdots 1}$, for
example. Let G denote a grouping of the indices of F into two classes $P_1$, $P_2$
of partitions (multipartite indices), where $P_1$ contains an even number,
greater than or equal to 2, of partitions. Since the partitions in $P_1$ and $P_2$
play different roles in the general theory of forms, the partitions in $P_1$ are
said to be signant while those of $P_2$ are non-signant. To every G there
corresponds uniquely a non-negative integer $D_G$ defined elsewhere in terms of
generalized determinants associated with F. The integer $D_G$ is not changed if $\Phi$
is replaced by a field $\Psi$ which contains $\Phi$. To every grouping g of the indices


* Presented to the Society, April 10, 1936 and April 9, 1937; presented to the International 
Mathematical Congress, July 18, 1936; received by the editors August 19, 1937. 
of F into at least two partitions there corresponds a unique non-negative
integer $F_g$ defined in terms of factorization properties of the matrix A
associated with F. The integer $F_g$ depends on the field $\Phi$ in which the coefficients of
F are embedded. The ranks of types $D_G$ and $F_g$ are called determinant ranks
and factorization ranks, respectively. Under non-singular linear transformations
$x_i = b_{i\alpha} x'_\alpha$, $x_j = b_{j\beta} x'_\beta$, $\cdots$, $x_m = b_{m\gamma} x'_\gamma$ on F these ranks are invariant.*

To give the theory of this paper completeness, if F is a linear form $a_i x_i \ne 0$,
we define for F one rank $D_G$ and one rank $F_g$ each equal to 1.

Let $\tau_n(p; r, s, \cdots, t)$ be the number of distinct permutations of p
integers chosen from $1, \cdots, n$, where r are alike, s are alike, $\cdots$, t are alike,
and $r + s + \cdots + t = p$; and where there are at least two numbers in the set
$r, s, \cdots, t$. Evidently,

$\tau_n(p; r, s, \cdots, t) = \dfrac{p!}{r!\, s! \cdots t!}.$

Let $k_{np}$ be the class of all integers $\tau_n(p; r, s, \cdots, t)$, for given n, p. In the
case where $p \ge 2$ a field $\Phi$ will be said to be (n, p)-proper if its characteristic
is different from all prime factors of the numbers in the class $k_{np}$. Every field
is said to be (n, 1)-proper.
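
The definition of an (n, p)-proper field can be made concrete with a short computation. The sketch below assumes the count of distinct permutations is the multinomial number $p!/(r!\,s!\cdots t!)$ and that the class $k_{np}$ runs over all ways of writing p as a sum of at least two and at most n positive multiplicities; all function names are hypothetical. It lists the characteristics that an (n, p)-proper field must avoid.

```python
from itertools import permutations
from math import factorial


def distinct_permutations(counts):
    """p!/(r! s! ... t!) for multiplicities counts = (r, s, ..., t)."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


def brute_force_count(items):
    """Count distinct orderings of a list directly (for cross-checking small cases)."""
    return len(set(permutations(items)))


def partitions(p, max_parts, largest=None):
    """Yield partitions of p into at most max_parts parts, in non-increasing order."""
    if largest is None:
        largest = p
    if p == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(p, largest), 0, -1):
        for rest in partitions(p - first, max_parts - 1, first):
            yield (first,) + rest


def excluded_characteristics(n, p):
    """Primes dividing some number of the class k_np (at least two multiplicities)."""
    primes = set()
    for part in partitions(p, n):
        if len(part) < 2:
            continue
        m = distinct_permutations(part)
        d = 2
        while d * d <= m:
            while m % d == 0:
                primes.add(d)
                m //= d
            d += 1
        if m > 1:
            primes.add(m)
    return primes
```

Under these assumptions `excluded_characteristics(2, 2)` returns `{2}`, which agrees with the usual restriction to characteristic not 2 for quadratic forms, and `excluded_characteristics(2, 3)` returns `{3}`.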

The symmetric matrix $A = (a_{ij\cdots m})$ of the n-ary form $F = a_{ij\cdots m} x_i x_j \cdots x_m$
of degree p with coefficients in a field $\Phi$ is unique if and only if the field $\Phi$ is
(n, p)-proper.

An (n, p)-proper field is evidently (m, p)-proper, where m <n. 

If F is quadratic, there is only one determinant rank and one factorization 
rank, and they are equal. The latter rank is trivial in this case. If F is of higher 
degree, the ranks of F are not always equal. A class of equality and inequality 
relations which exist between the ranks of F have been obtained elsewhere,†
but this class is not complete. 

I have proved, for example, that all determinant ranks of binary forms of 
any degree are equal when these ranks have exactly two partitions signant.‡

If a certain determinant rank and a certain factorization rank are equal, 
all of the remaining ranks of F are equal. The leading contribution of the pres- 
ent paper is Theorem 2 which states that the ranks of an n-ary form F of degree p 


* R. Oldenburger, Composition and rank of n-way matrices and multilinear forms, Annals of Math- 
ematics, vol. 35 (1934), pp. 625-636, 645-646. We shall call this paper I. 

R. Oldenburger, Non-singular multilinear forms and certain p-way matrix factorizations, these
Transactions, vol. 39 (1936), pp. 422-456, especially p. 432. We shall call this paper II.

† See paper I and its supplement, pp. 622-657.

‡ R. Oldenburger, Relations between ranks of a general matrix, Annals of Mathematics, vol. 39
(1938), pp. 172-178. 
with respect to an (n, p)-proper field $\Phi$, are equal if and only if F is equivalent
under non-singular linear transformations in $\Phi$ to a form

$C = \lambda_1 x_1^p + \cdots + \lambda_r x_r^p.$


By the equality of ranks of a quadratic form this theorem includes the well 
known result* that any quadratic form is equivalent in a field of character- 
istic not 2 to a form C with p=2. Here and in what follows “equivalent” will 
be used for “equivalent under non-singular linear transformations.” 

That a multilinear form $M = a_{ij\cdots m} x_i y_j \cdots z_m$ has equal ranks for a field $\Phi$
if and only if M is equivalent in $\Phi$ to a form $\lambda_1 x_1 y_1 \cdots z_1 + \cdots + \lambda_r x_r y_r \cdots z_r$
was noted in an earlier paper.†

With every form F given by

$F = a_{ij\cdots m} x_i x_j \cdots x_m,$

where the matrix $(a_{ij\cdots m})$ is symmetric, we can associate a multilinear form
M given by

$M = a_{ij\cdots m} x_i y_j \cdots z_m$

and uniquely determined by F. If F is equivalent under non-singular linear
transformations in an (n, p)-proper field $\Phi$ to a sum C of n pth powers, we
shall say that F is non-singular with respect to $\Phi$. Theorem II of the present
paper implies that F is non-singular with respect to an (n, p)-proper field $\Phi$ if
and only if its associated multilinear form M is non-singular with respect to
$\Phi$ (that is, M is non-singular in the sense of paper II). We have thus succeeded
in generalizing the definitions of non-singularity of quadratic and bilinear
forms to general forms in such a way that the above property, obviously valid
for quadratic and bilinear forms, also holds for general forms F and M. The
importance of my earlier Transactions paper is now more clearly brought out.
The conditions obtained there for determining whether or not a given
multilinear form is non-singular may be applied here to the form M to determine
whether or not F is non-singular.

Since it is not always a simple matter to determine the factorization ranks
of a given form F, other conditions are obtained for the equivalence of F to
a sum of pth powers. It is proved that if p is odd and greater than or equal to 3
an n-ary form F is equivalent in an (n, p)-proper field $\Phi$ to a sum of pth
powers if and only if the generalized determinant $|x_i a_{ij\cdots m}|$, with $j, \cdots, m$
signant, is the product of linearly independent linear factors in $\Phi$, and under

* C.C. MacDuffee, The Theory of Matrices, Ergebnisse der Mathematik, vol. 2, part 5, 1933. 
† R. Oldenburger, II, p. 432.


222 RUFUS OLDENBURGER [September 


reduction in $\Phi$ to $k x_1' \cdots x_n'$ the form F reduces covariantly to C. If p is
even and greater than or equal to 4, the same statement applies except that
$|x_i a_{ij\cdots m}|$ is replaced by $|x_i x_j a_{ijk\cdots m}|$, with $k, \cdots, m$ signant, and this
determinant is a product of squares of linearly independent linear factors.
Hočevar* sketched part of a proof of the property that a form factors into
linear factors in the complex field if and only if it divides every third order
minor of its Hessian, and he gave a method of finding these factors which
involves, besides simply performed algebraic manipulations, only the solution
of an algebraic equation E with an inequality side condition. Although the
fact is not mentioned by Hočevar, this proof is valid only if the form has no
repeated factors involving variables. Hence, assuming that the roots of E
have been found subject to the side condition, we can determine directly the
equivalence of a form to a sum of pth powers in the complex field in a finite
number of steps. In this sense the problem of determining whether or not a given
form F is equivalent to a sum of pth powers is completely solved for the complex
field. If F is at most quaternary and of odd degree, the equation E is of the
fourth degree or less and can be solved by well known methods. The same
statement applies if F is n-ary, where $n \le 2$, and of even degree. Whether or
not F is equivalent to a sum of pth powers can be determined directly, in these
cases, in a finite number of steps.

There is a direct method, involving an induction process, of determining
whether or not the ranks of a form F are equal. It is based upon the
theorem that an n-ary form F of degree p, $(p \ge 2)$, is equivalent to C in
an (n, p)-proper field $\Phi$ if and only if one determinant rank of F is n and
the forms $F_i = a_{ij\cdots m} x_j \cdots x_m$, $(i = 1, \cdots, n)$, are simultaneously
equivalent in $\Phi$ to sums of (p−1)th powers. It is to be observed
that $F = x_1 F_1 + \cdots + x_n F_n$. The problem of simultaneous equivalence of
$F_1, \cdots, F_n$ to sums of (p−1)th powers, where these forms are quadratic,
is quite different from the problem for which these are cubic or of higher
degree. This is due in part to the essential difference between transformations
$x_i = b_{i\alpha} x'_\alpha$ which bring

$C = \lambda_1 x_1^p + \cdots + \lambda_r x_r^p$

into a similar form

$C' = \lambda_1' x_1'^p + \cdots + \lambda_r' x_r'^p,$

for p=2, and the transformations which bring C into C' for $p \ge 3$. In fact, for


* Hočevar, Sur les formes décomposables en facteurs linéaires, Comptes Rendus de l’Académie
des Sciences, vol. 138 (1904), pp. 745-747.
$p \ge 3$ and an (n, p)-proper field $\Phi$ the matrix $B = (b_{i\alpha})$ of the non-singular
linear transformation $x_i = b_{i\alpha} x'_\alpha$, $(i, \alpha = 1, 2, \cdots, n;\ n \ge r)$, bringing C into C'
is of the type

(1)  $B = \begin{pmatrix} B_{11} & 0 \\ B_{21} & B_{22} \end{pmatrix},$

where $B_{11}$ is of order r, and has but one non-vanishing element in each row and
column. The transformations which bring a quadratic form C into a similar
form C' are thus more complicated than the transformations which bring a
form C of the third degree or higher into a like form C'. There is an analogue
of this for multilinear forms.*

It is to be noted that the shape of B in (1) depends on r and not on p.
The solution of the general problem of equivalence in $\Phi$ of $F_1, \cdots, F_n$ to
sums of (p−1)th powers, $(p \ge 4)$, depends primarily on the following
statements:

1. A transformation with matrix B of (1) brings any sum of qth powers in
$x_1, \cdots, x_r$ in an (n, q)-proper field into a sum of qth powers with the same
number of non-vanishing coefficients.

2. For properly restricted fields the class of non-singular linear
transformations bringing a form C with $p \ge 3$ into a like form C' is identical with the
class of transformations which brings a set of forms $L, M, \cdots, Q$ into a like
set, where $L, M, \cdots, Q$ are sums of qth powers, $(q \ge 3)$, in $x_1, \cdots, x_r$ and
have the property that there exists for each variable $x_i$ in the set $x_1, \cdots, x_r$
at least one form in the set having a non-vanishing coefficient for $x_i^q$.

3. If an n-ary form $F = a_{ij\cdots m} x_i x_j \cdots x_m$ is equivalent in an (n, p)-proper
field $\Phi$ to a sum of pth powers under transformation with matrix B of (1), the
terms in F involving only $x_{r+1}, \cdots, x_n$ are equivalent in $\Phi$, under non-singular
linear transformations on $x_{r+1}, \cdots, x_n$, to a sum of pth powers.
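
The contrast between quadratic forms and forms of higher degree, and the first of the statements above, can be illustrated by expanding a substituted power sum. The sketch below is only an illustration under stated assumptions: polynomials are stored as dictionaries from exponent tuples to integer coefficients, the substitution is $x_i = b_{i\alpha} x'_\alpha$, and the function names and sample matrices are hypothetical.

```python
def poly_mul(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(x + y for x, y in zip(ef, eg))
            out[e] = out.get(e, 0) + cf * cg
    return {e: c for e, c in out.items() if c != 0}


def substitute_power_sum(lams, B, q):
    """Expand sum_i lams[i] * (sum_a B[i][a] * x'_a)**q in the new variables."""
    n = len(B)
    total = {}
    for i, lam in enumerate(lams):
        # Linear form that replaces x_i after the substitution.
        linear = {tuple(int(a == j) for j in range(n)): B[i][a]
                  for a in range(n) if B[i][a] != 0}
        term = {(0,) * n: 1}
        for _ in range(q):
            term = poly_mul(term, linear)
        for e, c in term.items():
            total[e] = total.get(e, 0) + lam * c
    return {e: c for e, c in total.items() if c != 0}


# Matrix of type (1) with r = 2: the leading 2 x 2 block has one non-zero
# entry in each row and column, and the upper right block vanishes.
B_type1 = [[0, 3, 0], [-1, 0, 0], [4, 7, 2]]
cubic_image = substitute_power_sum([2, 5], B_type1, 3)

# A leading block that is not of this shape.
B_mixed = [[1, 1, 0], [1, -1, 0], [0, 0, 1]]
quadratic_image = substitute_power_sum([1, 1], B_mixed, 2)
cubic_mixed = substitute_power_sum([1, 1], B_mixed, 3)
```

With `B_type1` the cubic sum $2x_1^3 + 5x_2^3$ goes into another sum of two cubes. With `B_mixed` the quadratic sum $x_1^2 + x_2^2$ is still carried into a sum of squares, while the cubic sum $x_1^3 + x_2^3$ acquires a mixed term, which is the phenomenon that restricts B to the type (1) when $p \ge 3$.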

By these and other properties, the problem of equivalence of a form F to 
a sum of pth powers is solved by treating consecutively the equivalence of 
sets of subforms associated with F, where the forms in these sets are of lower 
degree than F. For the sake of brevity some of the more involved parts of the 
theory will be omitted. 

In the study of non-singular multilinear forms carried through in an 
earlier paper, difficulties arose† in the process of determining the non-singularity
of a given form. The analogue for multilinear forms of the above induction
process avoids these difficulties entirely, so that it is a relatively simple


* R. Oldenburger, II, pp. 425-427, 442-443. 
† R. Oldenburger, II, p. 431.
matter to determine whether or not a given form is non-singular. Since the
analogue involves few new features, its presentation will be omitted. 

The induction process when applied to binary forms leads to conditions,
like those mentioned above, involving the generalized determinants
$|x_i a_{ij\cdots m}|$ and $|x_i x_j a_{ijk\cdots m}|$, except that the conditions are much stronger
in that the determinants are replaced by ordinary second order 2-way
determinants, and no distinction is made between even and odd values of p.
Since the binary case involves at most the determination of whether or not
$(\epsilon)^{1/2}$ is in $\Phi$, given that $\epsilon$ is in $\Phi$, we may consider the problem of equivalence
of a binary form of any degree to be completely solved for a (2, p)-proper
field in the sense that this equivalence can be recognized by simple direct
steps. This development can be used to give necessary and sufficient
conditions for the equivalence of an algebraic equation P(x) = 0 to the equation

$y^p - A = 0$

under linear fractional transformations in a (2, p)-proper field $\Phi$. This
problem was considered for a finite field GF(p) by Brahana.*

Bronowski† considered the problem of the equivalence under transformations,
not necessarily non-singular, of a form of the pth degree to a form of the


type 


Bronowski translated the problem into one in geometry which has not been 
solved. 

Sylvester proved‡ that a fairly broad class of binary forms of degree p
can be written in the complex field as sums of pth powers of linear forms. 
These linear forms are, however, in general not linearly independent. 

It will be proved elsewhere that every form F with symmetric matrix can be
written in a field with p or more elements as a sum

$\alpha_1 L_1^p + \cdots + \alpha_r L_r^p,$

where r is finite, $L_1, \cdots, L_r$ are linear forms, and $\alpha_1, \cdots, \alpha_r$ are in the given
field. For a field with less than p elements this representation is not in general
possible. The forms which can be represented as above, where $L_1, \cdots, L_r$

* Brahana, Note on irreducible quartic congruences, these Transactions, vol. 38 (1935), pp. 395- 
400. 

† Bronowski, The sum of powers of a canonical expression, Proceedings of the Cambridge
Philosophical Society, vol. 29 (1933), pp. 69-82.

‡ J. J. Sylvester, Philosophical Magazine, 1851, p. 94.
are linearly independent, are the forms with which the present paper is
concerned. In this case r takes on its minimum value for the class of forms of
degree p in r variables, where the forms in this class cannot be written by means
of non-singular linear transformations in terms of less than r variables, and
these forms have symmetric matrices.

It is a significant thing that the representability of a form of odd degree in a
field $\Phi$ by a sum of powers

$aL^p + bM^p + \cdots + dN^p$

of linearly independent linear forms $L, M, \cdots, N$ depends directly on the
factorability of a second form in $\Phi$ into a product $kRS \cdots T$ of linearly
independent linear factors $R, S, \cdots, T$.

The generalized matrix method of approach used here is a new one in the 
theory of forms of degree higher than quadratic. 

Throughout the present paper when we equate two forms of degree p we will 
assume that the field of coefficients has p+1 or more elements, so that equality 
of forms implies equality of the corresponding coefficients. 

2. Ranks of general matrices and forms. Hitchcock* studied certain 
ranks of a p-way matrix. In another paper I proved that generalized de- 
terminant minors of the product 


( 


of two matrices $A = (a_{ij\cdots m})$ and $B = (b_{nq\cdots s})$ are not always sums of
products of determinant minors of A and B, but rather sums of products of
determinant minors of matrices called “derivates” associated with A and B.
In terms of determinant minors of A and derivates of A, I defined
determinant ranks of A for any grouping of the indices into partitions and allowable
signancy‡ of these partitions, and proved the invariance§ of these ranks
under non-singular linear transformations on the form $a_{ij\cdots m} x_i y_j \cdots z_m$
associated with A. The definitions of a few determinant ranks essential to the
argument will be given explicitly.


The matrix $A = (a_{ij\cdots m})$, $(i, j, \cdots, m = 1, \cdots, n)$, is said to be of order n.
We shall assume in what follows that A is symmetric. The ranks of a form
$F = a_{ij\cdots m} x_i x_j \cdots x_m$ are the ranks of its associated matrix A.


* F. L. Hitchcock, Multiple invariants and generalized rank of a p-way matrix or tensor, Journal 
of Mathematics and Physics, vol. 7 (1927), pp. 40-79. 

† R. Oldenburger, I, p. 632.

‡ For a discussion of signancy of generalized determinants see L. H. Rice, Introduction to higher
determinants, Journal of Mathematics and Physics, vol. 9 (1930), p. 48. 

§ R. Oldenburger, I, pp. 633-635, 645-646. 
The simplest determinant rank of the matrix A is the ordinary rank of the
2-way matrix $(a_{i\tau})$ obtained from A by letting i be the index of the rows of
$(a_{i\tau})$, and $\tau$ the index of the columns of $(a_{i\tau})$, where $\tau$ is the partition of indices
$j \cdots m$ ranging over the values $1\cdots11, 1\cdots12, \cdots, 1\cdots1n, \cdots, n\cdots nn$,
where n is the order of
A. This rank will be called the principal determinant rank of A. It was used
by Mayer in a paper in these Transactions.*
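
The principal determinant rank lends itself to direct computation. The sketch below assumes the matrix A is supplied as a dictionary from 1-based index tuples to rational entries (missing entries being zero); it flattens A into the 2-way matrix with n rows and $n^{p-1}$ columns and finds the ordinary rank by exact elimination. The function name is hypothetical, and the sample matrix is that of the binary cubic $x^3 - 3xy^2$.

```python
from fractions import Fraction
from itertools import product


def principal_determinant_rank(a, n, p):
    """Rank of the n x n**(p-1) matrix with rows i and columns (j, ..., m)."""
    rows = [
        [Fraction(a.get((i,) + tau, 0))
         for tau in product(range(1, n + 1), repeat=p - 1)]
        for i in range(1, n + 1)
    ]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == n:
            break
    return rank


# Symmetric matrix of x^3 - 3xy^2: a_111 = 1, a_122 = a_212 = a_221 = -1.
cubic = {(1, 1, 1): 1, (1, 2, 2): -1, (2, 1, 2): -1, (2, 2, 1): -1}
rank_of_cubic = principal_determinant_rank(cubic, n=2, p=3)
```

For a single cube such as $x^3$, whose only non-zero entry is $a_{111}$, the same routine gives rank 1.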

Let the indices of $A = (a_{ij\cdots m})$ be grouped into partitions $\rho, \sigma, \cdots, \tau$.
The minimum value of $\kappa$ for which A can be written in the form

$A = \left(\sum_{\alpha=1}^{\kappa} b_{\alpha\rho} c_{\alpha\sigma} \cdots d_{\alpha\tau}\right),$

where $b_{\alpha\rho}, \cdots, d_{\alpha\tau}$ are in a given field $\Phi$, is called the $(\rho\sigma\cdots\tau)$-factorization
rank of A with respect to $\Phi$. The number $\kappa$ is always finite. The $(ij\cdots m)$-factorization
rank of A with respect to $\Phi$ is called the principal factorization
rank† of A with respect to $\Phi$. That this rank depends on $\Phi$ is evident from the
following example.

The form $x^3 - 3xy^2$ has a matrix $A = (a_{ijk})$ for which $a_{111} = 1$,
$a_{122} = a_{212} = a_{221} = -1$ and all other elements vanish. The matrix A can be written as

$\left(\sum_{\alpha=1}^{2} b_{\alpha i} b_{\alpha j} b_{\alpha k}\right),$

where

$(b_{\alpha i}) = \begin{pmatrix} 1/2^{1/3} & i/2^{1/3} \\ 1/2^{1/3} & -i/2^{1/3} \end{pmatrix}, \quad i = (-1)^{1/2}.$

Since the principal determinant rank of A is 2, the principal factorization
rank is at least 2. Therefore, the principal factorization rank of A with
respect to the complex field is 2. In the complex field $x^3 - 3xy^2$ is equivalent to
$x^3 + y^3$ by the results of the present paper (Theorem I). In the field of reals
the principal factorization rank of A is at least 3 since otherwise $x^3 - 3xy^2$
would be equivalent in the field of reals to $\lambda x^3 + y^3$ by Theorem I of this
paper. It follows from Theorem IIIa of the present paper that these forms are
not equivalent for this field.
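
The complex factorization in this example can be confirmed in floating point. The sketch below assumes the two rows of $(b_{\alpha i})$ are $(1, i)/2^{1/3}$ and $(1, -i)/2^{1/3}$, which corresponds to the identity $x^3 - 3xy^2 = \tfrac12[(x+iy)^3 + (x-iy)^3]$; indices are 0-based in the code and the variable names are hypothetical.

```python
cbrt2 = 2 ** (1 / 3)

# Rows alpha = 0, 1 of the factor matrix; columns i = 0, 1 stand for x, y.
b = [[1 / cbrt2, 1j / cbrt2],
     [1 / cbrt2, -1j / cbrt2]]


def entry(i, j, k):
    """Entry a_ijk of the matrix sum_alpha b[alpha][i] * b[alpha][j] * b[alpha][k]."""
    return sum(b[alpha][i] * b[alpha][j] * b[alpha][k] for alpha in range(2))


# Expected symmetric matrix of x^3 - 3xy^2 (all other entries vanish).
expected = {(0, 0, 0): 1, (0, 1, 1): -1, (1, 0, 1): -1, (1, 1, 0): -1}

max_error = max(
    abs(entry(i, j, k) - expected.get((i, j, k), 0))
    for i in (0, 1) for j in (0, 1) for k in (0, 1)
)
```

A value of `max_error` at the level of rounding error indicates that two terms suffice over the complex field, in agreement with the factorization rank 2 stated in the text.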


* W. Mayer, Die Differentialgeometrie der Untermannigfaltigkeiten des $R_n$ konstanter Krümmung,
these Transactions, vol. 38 (1935), pp. 274-310.

† R. Oldenburger, On arithmetic invariants of binary cubic and binary trilinear forms, Bulletin of
the American Mathematical Society, vol. 42 (1936), pp. 871-873. 
If the principal factorization and determinant ranks of $F = a_{ij\cdots m} x_i x_j \cdots x_m$
are equal for a field for which $(a_{ij\cdots m})$ is unique, by the definitions of these
ranks the form F is equivalent at once, for some n, to a like form for which

$(a_{ij\cdots m}) = \left(\sum_{\alpha=1}^{n} b_{\alpha i} c_{\alpha j} \cdots d_{\alpha m}\right),$

where $(b_{\alpha i}), (c_{\alpha j}), \cdots, (d_{\alpha m})$ are non-singular matrices of order n. The
associated multilinear form $M = a_{ij\cdots m} x_i y_j \cdots z_m$ is equivalent to $M' = a'_{ij\cdots m} x'_i y'_j \cdots z'_m$
the matrix of which is $\left(\sum_{\alpha} \delta_{\alpha i} \delta_{\alpha j} \cdots \delta_{\alpha m}\right)$, where
$(\delta_{\alpha i}) = (\delta_{\alpha j}) = \cdots = (\delta_{\alpha m})$ is the Kronecker delta of order n. It is a fairly
simple matter to show that all of the ranks of M' equal n. Since the ranks of
F are ranks of M, this proves the following lemma:

LEMMA I. If the principal ranks of an n-ary form F of degree p are equal for
an (n, p)-proper field, all ranks of F are equal.


3. Necessary and sufficient conditions for equality of ranks. We shall 
prove the following theorem: 

THEOREM I. Let F be a given n-ary form of degree p, and let $\Phi$ be an (n, p)-proper
field. The principal determinant rank of F and the principal factorization
rank of F in $\Phi$ are equal to n if and only if F is equivalent in $\Phi$ to a sum of n
pth powers.

It is obvious that the theorem is true for linear forms. The result for the
quadratic case is well known.* We shall assume in what follows that $p \ge 3$.

Let $F = a_{ij\cdots m} x_i x_j \cdots x_m$. From the definitions of ranks the principal
ranks of F are equal to n if and only if the matrix of F can be written as

(2)  $(a_{ij\cdots m}) = \left(\sum_{\alpha=1}^{n} a_{\alpha i} b_{\alpha j} \cdots d_{\alpha m}\right),$

where $(a_{\alpha i}), \cdots, (d_{\alpha m})$ are non-singular.
If F is equivalent in $\Phi$ to a sum of pth powers, there exist non-vanishing
constants $\lambda_\alpha$ in $\Phi$ and a non-singular matrix $(g_{\alpha\beta})$ with elements in $\Phi$ such that

$F = \sum_{\alpha} \lambda_\alpha g_{\alpha i} g_{\alpha j} \cdots g_{\alpha m}\, x_i x_j \cdots x_m.$
The matrix $A = (a_{ij\cdots m})$ of F is now of the form (2) where $(a_{\alpha i}) = (\lambda_\alpha g_{\alpha i})$,
$(b_{\alpha j}) = (g_{\alpha j})$, $\cdots$, $(d_{\alpha m}) = (g_{\alpha m})$, $\alpha$ not summed.

Since $(g_{\alpha\beta})$ is non-singular, $(a_{\alpha i}), \cdots, (d_{\alpha m})$ are non-singular.


*C. C. MacDuffee, The Theory of Matrices. 
Conversely, assume that the matrix A of F is in the form (2). Let $(A^{i\alpha})$
be the inverse of $(a_{\alpha i})$. Applying the transformation

$x_i = A^{i\beta} x'_\beta,$

we obtain a form whose matrix* is of the form

$\left(\sum_{\alpha} \delta_{\alpha i} b_{\alpha j} \cdots d_{\alpha m}\right),$

where $(\delta_{\alpha i})$ is a Kronecker delta. It is therefore no restriction on the
generality of the method to assume at the start that $(a_{\alpha i}) = (\delta_{\alpha i})$. We shall prove
that the form F is then a sum of pth powers.

Write A = Catdam), and B= (bai), C= (Cat), D= (dam). 
Assume that $b_{11} = 0$. Since B is non-singular, $(b_{12}, \cdots, b_{1n}) \ne (0, \cdots, 0)$. The
symmetry of A and the invariance of symmetry imply that 


for all values of k, - - - , . The right-hand members of (3) vanish. The direct 
product H=CX XEofC,---, Eis the matrix (cj, - - @m:) whose ele- 
ments are the possible products of elements of C, - - - , E. 

Display H as an ordinary 2-way matrix (h,,), where p is the partition 

j and a is the partition k - - - By a lemma proved elsewheref the 
determinant of (4,.) is a product of powers of determinants of C,---, E. 
Since these are non-singular, H is non-singular as displayed. Since the prod- 
ucts Ci, °- + 1 are the elements of a row of (h,,), and since this matrix is 
non-singular, these products cannot all vanish. It follows from (3) that 
=0. By the same argument cu= - =en=0. 


The symmetry of A implies that 
(4) bir €11dim = Cmidmi 


for every m. Since the left member of (4) vanishes, for every m some quantity 
in the set bmi, - - - , dm: Vanishes. For m=2 it is no restriction to take this 
to be ba. Since 


* The matrix can then be written as $(b_{\beta j} \cdots d_{\beta m})$ where $\beta$ is not summed. Necessary and
sufficient conditions for the factorability of a matrix in this form were given in II, p. 452.
† R. Oldenburger, I, p. 625; II, p. 442.
for every j, and since by the non-singularity of B $(b_{22}, \cdots, b_{2n}) \ne (0, \cdots, 0)$,
it follows that 


dy = 0. 
Since one of the quantities ca, - - - , da: vanishes, it is no restriction to take 
this to be ¢2;. Since 
(5) da = qaifar dai, 


for all i, 7, and beic2; cannot be zero for all 7, 7 (by the non-singularity of 
B,C), and since the left-hand member of (5) vanishes, 


dey = 0. 
It is seen by induction that gu= --- =dx=0. By the same argument we 
prove that 
$$b_{m1}=\cdots=d_{m1}=0$$
for every $m$. Since $B,\ldots,D$ are non-singular, the elements $b_{11},\ldots,d_{11}$
cannot be zero. It follows that no diagonal elements of $B,\ldots,D$ vanish.
We shall now prove that the non-diagonal elements of $B,\ldots,D$ vanish.


If some non-diagonal element of B vanishes, we may take it to be d12. By 
the symmetry of A 


(6) dy = dy = bo1C21921 dz; 


whence ¢2=0. Hence - - - =di2=0. By (6) some element in the set 
bei, - - - , de: vanishes. It is no restriction on the generality of the method 
to take this to be b2:. From 


+++ da = b22C21922 + dog 


it follows that c:=0. Hence gau= - -- =d2=0. We have proved that if 
b,, =0 for a given r, s, where then Crs, Csr, Ors, 

If elements other than },2 in the first row of B are zero, it is no restriction 
to take them to be dys, - - - , and to assume that , bin Then 


and 
bij, bi, C1j, dij, FO, 
By the symmetry of A, 


for $j=2,\ldots,r$. Hence

It follows at once that $d_{\alpha m}=0$, for $\alpha=r+1,\ldots,n$; $m=2,\ldots,r$, and

Evidently, also $b_{\alpha m}=\cdots=e_{\alpha m}=0$ for the
same pairs of ranges of $\alpha$ and $m$.
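We have proved that

$$
(7)\qquad B=\begin{pmatrix}
b_{11} & 0 & \cdots & 0 & b_{1,r+1} & \cdots & b_{1n}\\
0 & b_{22} & \cdots & b_{2r} & 0 & \cdots & 0\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
0 & b_{r2} & \cdots & b_{rr} & 0 & \cdots & 0\\
b_{r+1,1} & 0 & \cdots & 0 & b_{r+1,r+1} & \cdots & b_{r+1,n}\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
b_{n1} & 0 & \cdots & 0 & b_{n,r+1} & \cdots & b_{nn}
\end{pmatrix},
$$

and $C,\ldots,D$ are of the same form. If no element of $B$ vanishes, $B$ is of the
form (7) where $r+1=2$, and the minor

$$\begin{pmatrix}b_{22}&\cdots&b_{2r}\\ \vdots&&\vdots\\ b_{r2}&\cdots&b_{rr}\end{pmatrix}$$

does not occur. The representation (7) hence includes all cases.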
If some elements of the first row of 


Ors 
BY = 
baa 
vanish, let these be b,41,-40, * 0-41,n, (622). By the above reasoning 


vanishes. Since this contradicts an earlier assumption, it follows that no ele- 
ments of B* are zero. 

By simultaneous interchanges of the rows and columns of B, - - - , D, cor- 
responding to permutation of the variables of F, we can bring B into the form 


bei Des 0 
B= s=n-—r+1, 
0 °° * 
0 On Dan
and $C,\ldots,D$ into similar forms where the elements of


are all non-zero. 
Applying the above argument to the set of minors 


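we conclude that $B,\ldots,D$ can be written in the form

$$
(8)\qquad \begin{pmatrix}
B_1 & 0 & \cdots & 0\\
0 & B_2 & \cdots & 0\\
\vdots & & & \vdots\\
0 & 0 & \cdots & B_\mu
\end{pmatrix},\ \ldots,\
\begin{pmatrix}
D_1 & 0 & \cdots & 0\\
0 & D_2 & \cdots & 0\\
\vdots & & & \vdots\\
0 & 0 & \cdots & D_\mu
\end{pmatrix},
$$

where the elements of the minors $B_1,\ldots,B_\mu,\ldots,D_1,\ldots,D_\mu$ are not
zero, and $B_i,\ldots,D_i$ are minors of the same order for every $i$.
Assuming that $B_1,\ldots,B_\mu$ are not all of order 1, we may write $B,\ldots,D$
in the form (8), where $B_1,\ldots,D_1$ are of order $R\geq2$.
The determinant of $B$ is
$$|B|=|B_1|\cdot|B_2|\cdots|B_\mu|.$$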


We can write 


bri bre ber 
since the denominator is not zero. 
By the symmetry of A 


whence the first two rows of the above determinant are identical. Hence 


$|B_1|=0$, whence $B$ is singular. This gives a contradiction. Minors of (8) are
of order 1 and


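$$
B=\begin{pmatrix}b_{11}&0&\cdots&0\\ \vdots&&&\vdots\\ 0&0&\cdots&b_{nn}\end{pmatrix},\ \ldots,\
D=\begin{pmatrix}d_{11}&0&\cdots&0\\ \vdots&&&\vdots\\ 0&0&\cdots&d_{nn}\end{pmatrix}.
$$
Now

$$F=b_{\alpha i}c_{\alpha j}\cdots d_{\alpha m}x_ix_j\cdots x_m=b_{11}c_{11}\cdots d_{11}x_1^p+\cdots+b_{nn}c_{nn}\cdots d_{nn}x_n^p,$$
which completes the proof of the theorem.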

The above argument is valid if there are only two matrices in the set 
B,---,D. 

By Lemma I we can state Theorem I in the following form: 


THEOREM Ia. Let $F$ and $\phi$ be given as in Theorem I. The ranks of a form $F$
of degree $p$ are equal in $\phi$ if and only if $F$ is equivalent in $\phi$ to a sum of $p$th
powers.


If the characteristic of $\phi$ is 2 and $F$ is quadratic of rank $n$, $F$ is always a
sum of $r\leq n$ squares since $a_{ij}+a_{ji}=2a_{ij}=0$. This generalizes for $p$-ic forms,
$(p\geq3)$. We therefore have the theorem:


THEOREM II. Let $\phi$ be a field whose characteristic is a factor of all numbers
in the set $k_{np}$ of §1, or an $(n,p)$-proper field. The ranks (or the principal ranks)
of an $n$-ary form $F=a_{ij\cdots m}x_ix_j\cdots x_m$ of degree $p$ with respect to a field $\phi$ are
equal if and only if $F$ is equivalent in $\phi$ to a sum of $p$th powers.
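
The characteristic-2 remark that precedes Theorem II can be checked by brute force. The following is a minimal sketch, assuming the field is the two-element field GF(2) represented by integers mod 2; the function names are hypothetical and chosen only for this illustration. It compares values at every point of GF(2)^3, which is weaker than the formal polynomial identity but shows the cancellation of the cross terms.

```python
from itertools import product

def quadratic_form_mod2(a, x):
    """Evaluate F = sum_{i,j} a[i][j] x_i x_j over GF(2)."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n)) % 2

def diagonal_part_mod2(a, x):
    """Evaluate only the square terms sum_i a[i][i] x_i^2 over GF(2)."""
    return sum(a[i][i] * x[i] * x[i] for i in range(len(x))) % 2

# A symmetric coefficient matrix with non-zero off-diagonal entries.
a = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]

# The cross terms cancel in pairs because a_ij + a_ji = 2 a_ij = 0 in GF(2).
agree = all(quadratic_form_mod2(a, x) == diagonal_part_mod2(a, x)
            for x in product([0, 1], repeat=3))
print(agree)
```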


4. A condition for the equivalence of forms of odd degree to sums of pth
powers. Let $A=(a_{ij\cdots m})$ denote a $p$-way matrix, where $i,j,\ldots,m=1,\ldots,n$,
and $p$ is even. Let $i_1,\ldots,i_n$ denote $n$ distinct values of $i$; $j_1,\ldots,j_n$ distinct
values of $j$; and so on. Let $(\epsilon^{j_1\cdots j_n})$ be the generalized Kronecker delta
which is equal to $1$ if $(j_1\cdots j_n)$ is an even permutation of $1,\ldots,n$, and
equal to $-1$ if $(j_1\cdots j_n)$ is an odd permutation of $1,\ldots,n$. The determinant
of $A$ with all indices signant is defined* to be


where the summation is over all distinct permutations of the numbers in the
sets $(i_1,\ldots,i_n),\ldots,(m_1,\ldots,m_n)$. It is of order $n$.

Let an $n$-ary form $F=a_{ij\cdots m}x_ix_j\cdots x_m$ be of odd degree $p$. The $i$-characteristic
determinant of $F$ is defined to be the determinant $|x_ia_{ij\cdots m}|$ with all
indices $[j,\ldots,m]$ signant. If $F$ is of degree 3, this is the Hessian of $F$ except
for a constant factor. We shall prove the following theorem:


* R. Oldenburger, I, p. 631.
THEOREM III. An $n$-ary form $F$ of odd degree $p\geq3$ is equivalent in an $(n,p)$-proper
field $\phi$ to a sum of $n$ $p$th powers, if and only if the $i$-characteristic determinant
$D$ of $F$ factors in $\phi$ into linearly independent linear factors, and under
reduction in $\phi$ of $D$ to canonical form $Kx_1\cdots x_n$, $F$ transforms covariantly to a
sum of $p$th powers.


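The $i$-characteristic determinant of $G=\lambda_ix_i^p$ is the determinant

$$D=|x_i\lambda_i\delta_{ij\cdots m}|,$$

where $\delta=(\delta_{ij\cdots m})$ is the generalized Kronecker delta whose only non-vanishing
elements are

$$\delta_{11\cdots1}=\delta_{22\cdots2}=\cdots=\delta_{nn\cdots n}=1.$$

Evidently
(10) $D=Kx_1x_2\cdots x_n$,
where $K=\lambda_1\cdots\lambda_n$. Under non-singular transformations
(11) $x_i=p_{ij}x_j'$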


on G to give a form F, D transforms by a general theorem on products of 
determinants* into 


D' =| pix}, 


where $P=(p_{ij})$. The determinant $D$ is thus a covariant of $G$. It follows that
the $i$-characteristic determinant of any form $F$, equivalent to $G$, factors in $\phi$
into linearly independent linear factors. This is therefore a necessary condition
for the equivalence in $\phi$ of a form to a sum of $p$th powers.

Assume that the $i$-characteristic determinant of $F$ factors in $\phi$ into linearly
independent linear factors $L,M,\ldots,N$. Applying the transformations

$$x_1'=L,\ \ldots,\ x_n'=N$$
to $F$ we obtain a form $F'$ whose $i$-characteristic determinant is of the form
$Kx_1'\cdots x_n'$.
We therefore consider only forms $F$ where $D$ satisfies (10). We shall prove
that such a form is equivalent to a sum of $p$th powers if and only if it is already
a sum of $p$th powers. If $F$ is equivalent to a form $G$, then $G$ is equivalent
to $F$ under a transformation (11) which brings (10) covariantly into
$q\,x_1'\cdots x_n'$ for some $q$. For such a transformation


* R. Oldenburger, I, p. 632.
$$|P|^{p-1}K\prod_{i=1}^{n}(p_{ij}x_j')=q\,x_1'\cdots x_n'.$$

By the unique factorization property, similarly used elsewhere,* the linear
expressions $p_{1j}x_j',\ldots,p_{nj}x_j'$ are equal in some order to the products of
the $x_i'$ by constants in $\phi$. The transformations on $G$ which bring $D$ into
$q\,x_1'\cdots x_n'$ are of the form

(12) $x_i=p_{ij}x_j'$, $i,j=1,\ldots,n$; $j$ not summed,

where there is one $j$ for every $i$, and conversely. Since (12) transforms $G$ into
a sum of $p$th powers, $F$ is a sum of $p$th powers.
In the binary cubic case we have simply the following theorem:


THEOREM IIIa. A binary cubic form $F=a_{ijk}x_ix_jx_k$ is equivalent in a field $\phi$
with characteristic not 3 to a sum of two cubes if and only if the $i$-characteristic
determinant of $F$ factors in $\phi$ into distinct linear factors.


To prove this theorem it is only necessary to show that if the $i$-characteristic
determinant $D$ of $F$ factors in $\phi$ into $kx_1x_2$ for some $k\neq0$, then $F$ is a sum
of cubes. It is to be noted that $\phi$ is (2, 3)-proper if and only if its characteristic
is not 3. Let $a_{111}=\alpha$, $a_{222}=\beta$, $a_{112}=\gamma$, $a_{122}=\delta$. Then a simple direct
calculation shows that the $i$-characteristic determinant of $F$ is
$(\alpha\delta-\gamma^2)x_1^2+(\alpha\beta-\gamma\delta)x_1x_2+(\beta\gamma-\delta^2)x_2^2$.
If this is to be of the form $kx_1x_2$, we must have $\alpha\delta-\gamma^2=\beta\gamma-\delta^2=0$,
$\alpha\beta-\gamma\delta\neq0$. It follows readily that $\gamma=\delta=0$, $\alpha,\beta\neq0$; thus $F$ is of the
required form.
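
The direct calculation in this proof can be reproduced mechanically. The sketch below is only an illustration, assuming the index convention $a_{111}=\alpha$, $a_{222}=\beta$, $a_{112}=\gamma$, $a_{122}=\delta$ with a symmetric matrix $(a_{ijk})$ and rational coefficients; the function names are hypothetical. It returns the three coefficients of the determinant of the 2-way matrix $(x_ia_{ijk})$ and tests for the canonical form $kx_1x_2$.

```python
from fractions import Fraction

def i_characteristic_binary_cubic(alpha, beta, gamma, delta):
    """Coefficients (of x1^2, x1*x2, x2^2) of det(x_i a_ijk) for a binary cubic.

    Convention assumed here: a_111 = alpha, a_222 = beta, a_112 = gamma,
    a_122 = delta, with A = (a_ijk) symmetric.
    """
    a, b, g, d = (Fraction(v) for v in (alpha, beta, gamma, delta))
    # The 2x2 matrix (x_i a_ijk) is
    #   [[a*x1 + g*x2, g*x1 + d*x2],
    #    [g*x1 + d*x2, d*x1 + b*x2]].
    return (a * d - g * g, a * b - g * d, g * b - d * d)

def is_multiple_of_x1x2(coeffs):
    """True when the determinant has the canonical form k*x1*x2 with k != 0."""
    c11, c12, c22 = coeffs
    return c11 == 0 and c22 == 0 and c12 != 0

# F = 2*x1^3 + 5*x2^3 is already a sum of two cubes.
diag_case = i_characteristic_binary_cubic(2, 5, 0, 0)
# F = (x1 + x2)^3 + (x1 - x2)^3 = 2*x1^3 + 6*x1*x2^2, so a_111 = 2, a_122 = 2.
mixed_case = i_characteristic_binary_cubic(2, 0, 0, 2)
print(diag_case, mixed_case)
```

In the second case the determinant is $4x_1^2-4x_2^2=4(x_1-x_2)(x_1+x_2)$, which has distinct linear factors but is not yet in canonical form; the substitution $x_1'=x_1+x_2$, $x_2'=x_1-x_2$ brings it there.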

That the factorization of the $i$-characteristic determinant into linearly independent
linear factors is not in general enough to insure the equivalence of an
$n$-ary form $F$ of degree $p$ to a sum of $n$ $p$th powers follows from an example:

Example. Let $F=6x_1x_2x_3$ with matrix $A=(a_{ijk})$. The only non-vanishing
elements of $A$ are $a_{123}=a_{132}=a_{213}=a_{231}=a_{312}=a_{321}=1$. Let $\phi$ be a (3, 3)-proper
field. The $i$-characteristic determinant of $F$ is $D=2x_1x_2x_3$. Except for an integral
factor it is the Hessian of $F$. Now $D$ is a product of linearly independent
linear factors in $\phi$. Since $D$ is in canonical form, if $F$ is equivalent to a sum of
three cubes, by Theorem III, $F$ is such a sum. The form $F$ is not such a sum.
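
The value $D=2x_1x_2x_3$ in the example can be confirmed numerically. This is a small sketch over the integers (indices run from 0 instead of 1, and the helper names are hypothetical); since both sides are polynomials of degree at most 1 in each variable, agreement on a grid with five values per variable establishes the identity.

```python
from itertools import permutations, product

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# a_ijk = 1 exactly when (i, j, k) is a permutation of (0, 1, 2).
a = {idx: 1 for idx in permutations(range(3))}

def characteristic_matrix(x):
    """The 2-way matrix (x_i a_ijk), with rows j and columns k."""
    return [[sum(x[i] * a.get((i, j, k), 0) for i in range(3))
             for k in range(3)] for j in range(3)]

# Compare det(x_i a_ijk) with 2*x1*x2*x3 on a grid of integer points.
matches = all(det3(characteristic_matrix(x)) == 2 * x[0] * x[1] * x[2]
              for x in product(range(-2, 3), repeat=3))
print(matches)
```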

The determinant $|A|$ of a $p$-way matrix $A=(a_{ij\cdots m})$ of order $n$, ($p$ odd),
where $j,\ldots,m$ are signant, is given by (9). We may have $|A|\neq0$, even if
$a_{\alpha j\cdots m}=a_{\beta j\cdots m}$ for $\alpha\neq\beta$. The determinant is said to be of order $n$. The
$(j\cdots m,i)$ derivate of $A$ is the matrix obtained by adjoining $(n-1)$ matrices
all equal to $A$ in the direction associated with the index $i$ in the $p$-space


* R. Oldenburger, Equivalence of multilinear forms singular on one index, Duke Mathematical
Journal, vol. 2 (1936), p. 674. See also B. L. van der Waerden, Moderne Algebra, Springer, 1930, vol. 1,
pp. 63-66.
representation of $A$. The rank $r[j\cdots m,i]$ of $A$ and $F=a_{ij\cdots m}x_ix_j\cdots x_m$ is
the upper bound of the orders of the non-vanishing determinant minors of
this derivate, expanded with $j,\ldots,m$ signant.*


For an $(n,n)$- and $(n,p)$-proper field the $i$-characteristic determinant of $F$
can be written as


where the summation is over all distinct choices $\Gamma$ of the values of $i_1\cdots i_n$
from the set $1,2,\ldots,n$; $k_\Gamma=1/\alpha!\,\beta!\cdots\gamma!$, where, for example, $\alpha$ of the
indices in $\Gamma$ are equal, $\beta$ other indices are equal, and $|a_{ij\cdots m}|$ is a determinant
minor of the $(j\cdots m,i)$ derivate of $A$ with $j,\ldots,m$ signant and all minors
of this derivate occur in (13). There follows the theorem:


THEOREM IV. Let $\phi$ be an $(n,n)$- and $(n,p)$-proper field. The $i$-characteristic
determinant of the $n$-ary form $F=a_{ij\cdots m}x_ix_j\cdots x_m$ of odd degree $p$, $(p\geq3)$, is
not identically zero if and only if the rank $r[j\cdots m,i]$ of $F$ is $n$.


By a theorem proved elsewhere,† the principal determinant rank of $F$ is
greater than or equal to $r[j\cdots m,i]$. Hence we have the following theorem:


THEOREM V. Let $\phi$ be a field satisfying the assumptions of Theorem IV. If
the $i$-characteristic determinant of an $n$-ary form $F$ of odd degree greater than
or equal to 3 factors into linearly independent linear factors in $\phi$, the principal
determinant rank of $F$ is $n$.


5. The equivalence of forms of even degree to sums of pth powers. With
an $n$-ary form

$$F=a_{ij\cdots m}x_ix_j\cdots x_m$$
of even degree $p$, $(p\geq4)$, associate the determinant
$$E=|x_ix_ja_{ijk\cdots m}|$$

with all indices signant. We shall call this the $ij$-characteristic determinant of $F$.
We shall prove the following theorem:


THEOREM VI. An $n$-ary form $F$ of even degree $p$, $(p\geq4)$, is equivalent in an
$(n,p)$-proper field $\phi$ to a sum of $p$th powers if and only if the $ij$-characteristic
determinant $E$ of $F$ factors in $\phi$ into squares of linearly independent linear
factors, and under reduction of $E$ in $\phi$ to canonical form $Kx_1^2\cdots x_n^2$, $F$ transforms
covariantly to a sum of $p$th powers.


* R. Oldenburger, I, p. 633.
† R. Oldenburger, I, p. 641.
The proof of this theorem is very similar to that of Theorem III. The
transformation from $F$ to

$$F'=a_{ij\cdots m}b_{i\alpha}b_{j\beta}\cdots b_{m\gamma}x_\alpha'x_\beta'\cdots x_\gamma'$$

under the non-singular transformation $x_i=b_{i\alpha}x_\alpha'$ corresponds, by a theorem
on determinants proved elsewhere,* to the transition from $E$ to


| xg | | Dey | 


The determinant $E$ of $G=\lambda_ix_i^p$, $(i=1,\ldots,n)$, is $Kx_1^2\cdots x_n^2$.
Evidently, the $ij$-characteristic determinant of a form $F$ equivalent to $G$ is of
the form $KL^2\cdots N^2$, where $L,\ldots,N$ are linearly independent linear factors.

Assume that the determinant $E$ of $F$ factors, as required in the theorem,
and that $F$ is equivalent to $G$. Letting $x_1'=L,\ldots,x_n'=N$ we can transform
the form $F$ into a new form for which $E=q\,x_1'^2\cdots x_n'^2$. Since $G$ is equivalent
to $F$, there is a non-singular transformation $x_i=b_{i\alpha}x_\alpha'$ such that


| bia )? (Onatd )? = gui? 


It follows from this identity that the transformation $x_i=b_{i\alpha}x_\alpha'$ is of the form
(12). Since the transformation (12) brings $G$ into a sum of $p$th powers, $F$ is a
sum of $p$th powers.

Let $A=(a_{ijk\cdots m})$ be a $p$-way matrix, ($p\geq4$ and even), of order $n$. The
determinant of $A$ with $k,\ldots,m$ signant is the expansion (9) except that
$\epsilon^{i_1\cdots i_n}\epsilon^{j_1\cdots j_n}$ does not occur. The indices $i$ and $j$ need not range over distinct values.
The $(k\cdots m,ij)$ derivate of $A$ is a matrix obtained from $A$ by adjoining
matrices equal to $A$ in the directions of $p$-space associated with the indices $i$
and $j$. The rank† $r[k\cdots m,ij]$ of $A$ and $F=a_{ij\cdots m}x_ix_j\cdots x_m$ is the upper
bound of the orders of the non-vanishing determinant minors of the
$(k\cdots m,ij)$ derivate of $A$ with $k,\ldots,m$ signant in these determinants.
For an $(n,p)$-proper field $\phi$ with a characteristic not equal to $2,3,\ldots,\xi_n$,
where $\xi_n$ is a positive prime depending on $n$, the coefficients in the expansion
of the $ij$-characteristic determinant of $F$ can be written as sums of $n$th order
determinant minors of the $(k\cdots m,ij)$ derivate of $A$ with $k,\ldots,m$ signant.
We therefore have the following theorem:


THEOREM VII. For a field $\phi$ with characteristic different from $2,3,\ldots,\xi_n$,
where $\xi_n$ is finite and large enough, the $ij$-characteristic determinant of an $n$-ary
form $F=a_{ij\cdots m}x_ix_j\cdots x_m$ of even degree $p$, $(p\geq4)$, is identically zero if the
rank $r[k\cdots m,ij]<n$.


* R. Oldenburger, I, p. 632.
† R. Oldenburger, I, p. 633.

Since $r[k\cdots m,ij]$ is not greater than the principal determinant rank
of $F$,* we have the following analogue of Theorem V.


THEOREM VIII. Let $\phi$ be a field with the properties of Theorem VI. If the
$ij$-characteristic determinant of an $n$-ary form $F$ of even degree greater than or
equal to 3 factors into squares of linearly independent linear factors in $\phi$, the
principal determinant rank of $F$ is $n$.


6. Method for obtaining linear factors. Let an $n$-ary form $F=a_{ij\cdots m}x_ix_j\cdots x_m$,
$(i,j,\ldots,m=1,\ldots,n)$, of degree $p$, $(p\geq2)$, with elements in the
complex field, be denoted by $F(x_1,\ldots,x_n)$.

Making the non-singular transformations $x_1=x_1'$, $x_2=\lambda x_1'+x_2'$, $\ldots$,
$x_n=\nu x_1'+x_n'$ on $F$ we obtain $F(x_1',\lambda x_1'+x_2',\ldots,\nu x_1'+x_n')$.
Assuming that $F\not\equiv0$, by the continuity of $F$, we have $F(1,\lambda,\ldots,\nu)\neq0$ for
some choice of $\lambda,\ldots,\nu\neq0$. Since this is the coefficient of $x_1'^p$ we have the following lemma:


LEMMA II. A non-identically vanishing form $F$ of degree $p$ in $x_1,\ldots,x_n$
is equivalent in the complex field to a form for which the coefficient of $x_1^p$ is not
zero.
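
The substitution behind Lemma II is easy to try on a concrete form. The following is a minimal sketch, assuming integer coefficients and a small search range; `find_shear` is a hypothetical helper, not part of the paper. It looks for values $(\lambda,\ldots,\nu)$ with $F(1,\lambda,\ldots,\nu)\neq0$, which is the coefficient of $x_1'^p$ after the transformation $x_1=x_1'$, $x_2=\lambda x_1'+x_2'$, $\ldots$, $x_n=\nu x_1'+x_n'$.

```python
from itertools import product

def find_shear(F, n, search=range(1, 4)):
    """Search for non-zero (lambda, ..., nu) with F(1, lambda, ..., nu) != 0.

    F is a callable taking n numbers. The returned tuple holds the n - 1
    shear parameters, or None if the search range contains no solution.
    """
    for shear in product(search, repeat=n - 1):
        if F(1, *shear) != 0:
            return shear
    return None

def F(x1, x2, x3):
    """Illustrative cubic form with no x1^3 term."""
    return x1 * x2 * x3

shear = find_shear(F, 3)
leading_coefficient = F(1, *shear)
print(shear, leading_coefficient)
```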


Assume that $F$ has a term in $x_1^p$ and does not contain a repeated irreducible
factor.† The last assumption implies that the resultant of $F$ and $\partial F/\partial x_1$
involving $x_2,\ldots,x_n$ is not identically zero in $x_2,\ldots,x_n$. We can therefore
choose values $a_2,\ldots,a_n$ such that

$$F(x_1,a_2,\ldots,a_n)=0,\qquad\frac{\partial F(x_1,a_2,\ldots,a_n)}{\partial x_1}=0$$

have no root $x_1$ in common.

By a theorem from implicit function theory it follows that $x_1$ is represented
analytically by distinct power series $P_1(x_2,\ldots,x_n),\ldots,P_p(x_2,\ldots,x_n)$ at
the distinct points $(\alpha_1,a_2,\ldots,a_n),\ldots,(\alpha_p,a_2,\ldots,a_n)$ for which

$$F(\alpha_i,a_2,\ldots,a_n)=0,\qquad\frac{\partial F}{\partial x_1}\neq0.$$

According to Hočevar‡ the derivatives of $x_1$ with respect to $x_2,\ldots,x_n$
of order higher than one involve the third order determinant minors of the
Hessian $H_F$ of $F$ in such a way that when these minors vanish simultaneously
for a point $(x_1,\ldots,x_n)$ where $\partial F/\partial x_1\neq0$, these derivatives also vanish. It
follows that if $F$ is a factor of these minors of $H_F$, the derivatives of $x_1$ of
higher order than the first vanish at all points for which $F=0$, $\partial F/\partial x_1\neq0$. In
this case the series $P_1,\ldots,P_p$ involve only terms of the first degree in
$x_2,\ldots,x_n$. Except for constant multipliers, the polynomial factors of $F$ are
the expressions $x_1-P_1,\ldots,x_1-P_p$ which are linear.


* R. Oldenburger, I, p. 641.

† For the case of two variables see G. A. Bliss, Algebraic Functions, American Mathematical
Society Colloquium Publications, vol. 16, New York, 1933, pp. 29-30.

‡ Hočevar, op. cit., pp. 745-747.


Conversely, if $F$ factors into linear factors, it can be shown at once that
the third order determinant minors contain $F$ as a factor.


Except for constant multipliers, the expressions $x_1-P_i$, $(i=1,\ldots,p)$,
are the same as
$$\left[\frac{\partial F}{\partial x_1}\right]_{p_i}x_1+\cdots+\left[\frac{\partial F}{\partial x_n}\right]_{p_i}x_n,$$
where $p_i$ is the point $(\alpha_i,a_2,\ldots,a_n)$. These expressions are linearly
independent if and only if the matrix

$$M=\left(\left[\frac{\partial F}{\partial x_j}\right]_{p_i}\right),\qquad i=1,\ldots,p;\ j=1,\ldots,n,$$

is of rank greater than or equal to $p$. For the $D$ of Theorem III, and $p=n$,
this condition can be stated in terms of the non-singularity of $M$. We have
the theorem:


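THEOREM IX. Let $F$ be a form in $x_1,\ldots,x_n$ of degree $n$. The form $F$ factors
in the complex field into linearly independent linear factors if and only if the
following set of conditions is satisfied:

(1) $F$ has no repeated factors.

(2) $F$ divides the third order determinant minors of its Hessian.

(3) At points $p_1=(\alpha_1,a_2,\ldots,a_n),\ldots,p_n=(\alpha_n,a_2,\ldots,a_n)$ on $F=0$ for
which there are distinct power series expansions for $x_1$ in terms of $x_2,\ldots,x_n$,
the matrix following is non-singular:

$$\begin{pmatrix}
\left[\dfrac{\partial F}{\partial x_1}\right]_{p_1}&\cdots&\left[\dfrac{\partial F}{\partial x_n}\right]_{p_1}\\
\vdots&&\vdots\\
\left[\dfrac{\partial F}{\partial x_1}\right]_{p_n}&\cdots&\left[\dfrac{\partial F}{\partial x_n}\right]_{p_n}
\end{pmatrix}.$$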
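
Condition (3) can be illustrated on the smallest non-trivial case. The sketch below assumes the binary quadratic $F=x_1^2-x_2^2=(x_1-x_2)(x_1+x_2)$ and the choice $a_2=1$, both picked only for illustration; for a binary form the Hessian has no third order minors, so only the gradient matrix is examined.

```python
def grad_F(x1, x2):
    """Gradient of the illustrative form F = x1^2 - x2^2 = (x1 - x2)(x1 + x2)."""
    return (2 * x1, -2 * x2)

# Fix x2 = a2 = 1; the roots of F(x1, 1) = 0 are x1 = 1 and x1 = -1.
points = [(1, 1), (-1, 1)]
rows = [grad_F(*pt) for pt in points]

# Each row holds the coefficients of (dF/dx1) x1 + (dF/dx2) x2 at a point of F = 0.
det = rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]
print(rows, det)
```

The two rows are proportional to $x_1-x_2$ and $x_1+x_2$, the linear factors of $F$, and the non-zero determinant reflects their linear independence.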


7. An induction process. We shall prove the following theorem:
THEOREM X. Let

$$F=a_{ij\cdots m}x_ix_j\cdots x_m,$$

be a form of degree $p\geq2$, and let $\phi$ be $(n,p)$-proper. The form $F$ is equivalent
in $\phi$ to a sum of $n$ $p$th powers if and only if

(1) the principal determinant rank of $F$ is $n$;

(2) the forms $F_1,\ldots,F_n$ are simultaneously equivalent in $\phi$ to a set of forms of the type

$$G_1=\mu_{11}y_1^{p-1}+\cdots+\mu_{1n}y_n^{p-1},\ \ldots,\ G_n=\mu_{n1}y_1^{p-1}+\cdots+\mu_{nn}y_n^{p-1}.$$
If $p=2$ the theorem is evidently true. Assume that $p\geq3$. If $F$ is equivalent
to


there exists a non-singular linear transformation


(14) $x_i=b_{i\alpha}y_\alpha$, $i,\alpha=1,\ldots,n$,
such that

where $(b_{i\alpha})=(b_{j\beta})=\cdots=(b_{m\gamma})$. Equation (15) implies the following matrix
identity:

(15’) bey) = (Cap---7); 


where $C=(c_{\alpha\beta\cdots\gamma})$ is the matrix of $G$ taken with all elements zero except
$c_{\alpha\alpha\cdots\alpha}$, $(\alpha=1,\ldots,n)$. Let the inverse $(b_{i\alpha})^{-1}$ of $(b_{i\alpha})$ be denoted by $(B^{\alpha i})$.
The identity now implies
(16) Dey) = (Cag..-~B). 
Setting $i=1,\ldots,n$ in (16) we obtain the following relations between minors
of the matrices in (16):


These equations yield 


If we let u,,=c,....B**, the right members of (17) become G,, - - - , Gu. 
Since the principal determinant rank of $G$ is obviously $n$, this rank of $F$
is $n$ if $F$ is equivalent to $G$.

We have proved the necessity of the conditions. To prove the sufficiency
we assume that the principal determinant rank of $F$ is $n$, and that
there exist non-singular linear transformations (14) in $\phi$ and a choice of $\mu_{ij}$,
$(i,j=1,\ldots,n)$, such that


* * Vy = Gn. 
By (18) 


Substituting (14) in (19) we obtain 


Since A is symmetric, the matrix of G;b;.y, is also symmetric. Denote this 
matrix by D=(d,s...,). Then 


dap... = Mesbea 


(the repeated index on the left does not indicate summation) while all other
elements of $D$ vanish. By symmetry $d_{\alpha\beta\cdots\beta}=d_{\beta\cdots\beta\alpha}$. If $\alpha\neq\beta$, then $d_{\beta\cdots\beta\alpha}=0$,
whence $d_{\alpha\beta\cdots\beta}=0$ also. Hence at most $d_{1\cdots1},\ldots,d_{n\cdots n}$ are non-zero. Since
$D$ is of principal determinant rank $n$, these are all non-zero. It follows that
$F$ is equivalent in $\phi$ to a form of type $G$, and that the transformation which
reduces $F_1,\ldots,F_n$ to $G_1,\ldots,G_n$ reduces $F$ to $G$.

The problem now resolves into the simultaneous equivalence in @ of 

8. Transformations which bring sums of pth powers, $p\geq3$, into sums of
pth powers. We shall prove the following theorem:

THEOREM XI. Let $\phi$ be an $(n,p)$-proper field. If the non-singular linear
transformation $x_i=b_{ij}y_j$ brings a sum of $p$th powers $a_ix_i^p$, $(a_i\neq0$; $i=1,\ldots,n$;
$p\geq3)$, into a sum of $p$th powers $c_iy_i^p$, $(c_i\neq0$; $i=1,\ldots,n)$, the matrix $B=(b_{ij})$
of order $n$ has exactly one element in each row and column different from zero.

Applying the transformation $x_i=b_{ij}y_j$ to $F=a_ix_i^p$ and equating to $c_iy_i^p$ we
obtain


(20) $a_i(b_{ij}y_j)^p=c_iy_i^p$.


Since $\phi$ is $(n,p)$-proper we can write (20) in the matrix form


where $(\delta_{ij})=(\delta_{ik})=\cdots=(\delta_{im})=I$, and $I$ is a Kronecker delta. Now

Since the element $c_n\delta_{nn}\delta_{nn}\cdots\delta_{nn}$ is the only non-vanishing element of the
matrix on the right of (22), it is of principal determinant rank 1, therefore
this is true of the left matrix of (22). We can write the latter matrix as the
product


(23) $(a_ib_{ij}b_{ik}\cdots b_{im})(b_{ig})$
of ordinary 2-way matrices, where $i$ is the column index of
$$N=(a_ib_{ij}b_{ik}\cdots b_{im}),$$
and $k\cdots m$ is the partition of indices associated with the rows of $N$ (in $(b_{ig})$
$i$ and $g$ are row and column indices respectively). Since $B$ is non-singular,
(23) implies that $N$ is of rank 1. Every second order minor of $N$ can be written
in the form


(24) 


sk," ** Dem, Deke Dams 


By a lemma of another paper* since $B$ is non-singular, the 2-way matrix
$Q=(b_{jk}\cdots b_{lm})$ is non-singular, where the partitions $\tau=j\cdots l$ and
$\sigma=k\cdots m$ are the partitions of the rows and columns of $Q$. Since the
determinants in (24) for a given $r$, $s$, $(r\neq s)$, form the class of all determinant
minors of the rows of $Q$ obtained by setting $j=\cdots=l=r,s$ it follows
that all of these determinant minors cannot vanish. Hence


= 0 
for rs. Since a,, a,#0, we have bb, =0 for r#s. Similarly 


for Take for equal to some value ¢. Then 6;,=0 for If 
b#0 for i=f, where f##, then b;.=0 for i¥f; similar conclusions hold for 
bis, - - , Din. 
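
The conclusion of Theorem XI describes what is now often called a monomial (generalized permutation) matrix. The following sketch, written over the integers with hypothetical helper names, checks that shape and confirms at a sample point that such a transformation carries a sum of cubes into a sum of cubes; it illustrates the easy direction only, not the proof above.

```python
def has_monomial_shape(B):
    """True when B has exactly one non-zero entry in each row and each column."""
    n = len(B)
    rows_ok = all(sum(1 for v in row if v != 0) == 1 for row in B)
    cols_ok = all(sum(1 for i in range(n) if B[i][j] != 0) == 1 for j in range(n))
    return rows_ok and cols_ok

def transform_power_sum(a, B, p, y):
    """Evaluate sum_i a_i (sum_j b_ij y_j)^p at the point y."""
    return sum(a_i * sum(b * y_j for b, y_j in zip(row, y)) ** p
               for a_i, row in zip(a, B))

B = [[0, 2, 0],
     [3, 0, 0],
     [0, 0, -1]]
a, p = [1, 1, 1], 3
# With x_1 = 2*y_2, x_2 = 3*y_1, x_3 = -y_3 the sum of cubes becomes
# 27*y_1^3 + 8*y_2^3 - y_3^3.
y = (2, 5, 7)
lhs = transform_power_sum(a, B, p, y)
rhs = 27 * y[0] ** 3 + 8 * y[1] ** 3 - y[2] ** 3
print(has_monomial_shape(B), lhs, rhs)
```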

It is to be noted that if $p$ is a prime and $\phi$ has characteristic $p$, any
non-singular linear transformation brings $F$ into a sum of $p$th powers.
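
The remark on characteristic $p$ rests on the identity $(u+v)^p=u^p+v^p$, which holds because the middle binomial coefficients $\binom{p}{k}$, $0<k<p$, are divisible by a prime $p$. A brief check of that divisibility, offered as an illustration and assuming Python 3.8 or later for `math.comb`:

```python
from math import comb

def middle_binomials_vanish_mod(p):
    """True when every binomial coefficient C(p, k), 0 < k < p, is divisible by p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

vanishing = [q for q in range(2, 12) if middle_binomials_vanish_mod(q)]
print(vanishing)
```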

Let $B=(b_{ij})$ be chosen so that it satisfies the preceding theorem. Apply
the transformation $x_i=b_{ij}y_j$ to a form $c_ix_i^q$. We obtain $c_i(b_{ij}y_j)^q$ which is a
sum of $q$th powers. We have proved the following theorem:


* R. Oldenburger, I, p. 625. See also G. W. King, The indeterminate and composite products of
matrices, Journal of Mathematics and Physics, vol. 13 (1934), p. 445.
THEOREM XII. Let $\phi$ be an $(n,p)$-proper field. If a non-singular linear transformation
in $\phi$ brings a sum of $n$ $p$th powers of $x_1,\ldots,x_n$ into such a sum,
where $p\geq3$, this transformation brings a sum of $q$th powers of $x_1,\ldots,x_n$, for
any $q$, into a sum of $q$th powers with the same number of non-vanishing
coefficients.


This implies the following corollary: 


COROLLARY I. Let $\phi$ be an $(n,p)$-proper field. Let $F$ be a sum of $n$ $p$th powers
of $x_1,\ldots,x_n$, where $p\geq3$. A set of forms $F,R,S,\ldots,T$ in $x_1,\ldots,x_n$ are
simultaneously equivalent in $\phi$ to sums of powers of $x_1,\ldots,x_n$ if and only if
$R,S,\ldots,T$ are already sums of powers in $x_1,\ldots,x_n$.


The reader should compare the simplicity of the theory of equivalence
of $r$-tuples of forms in $x_1,\ldots,x_n$, where one of the forms is equivalent to
$a_ix_i^p$, $(a_i\neq0$; $i=1,\ldots,n$; $p\geq3)$, with the corresponding theory for the
quadratic case.

We are now able to prove the following theorem: 


THEOREM XIII. Let $\phi$ be an $(n,p)$-proper field. If the non-singular transformation
$x_i=b_{ij}y_j$, $(i,j=1,\ldots,n)$, brings a sum of $r$, $(r<n)$, $p$th powers $F=a_ix_i^p$,
$(i=1,\ldots,r$; $a_i\neq0$; $p\geq3)$, into a sum of $r$ $p$th powers $F'=b_iy_i^p$, $(i=1,\ldots,r$;
$b_i\neq0)$, then the matrix $B=(b_{ij})$ is of the form
$$\begin{pmatrix}B_{11}&0\\ B_{21}&B_{22}\end{pmatrix},$$

where $B_{11}$ is a minor of order $r$ with exactly one non-vanishing element in each row
and column.


The matrix of F is unique since ¢ is also (r, p)-proper. Applying the trans- 
formation with matrix B to F we obtain 


r 


ai(bi;y;)”, 
Equating to F’, we can write 
Since there are no terms in y,4:, - - - , Yn on the right of (25) we obtain, by 


the restrictions on ¢, 


i=1 


for $j=r+1,\ldots,n$; $k,\ldots,m=1,\ldots,n$. Setting $j=r+1$ in the above equation
and writing it in matrix form we get


(26) ( bin) (bie) = 0, 
i=l 


where $k\cdots m$ is the partition of the rows of the left factor in (26), $i$ is the
index of the columns, and $i$, $g$ are the row and column indices of $(b_{ig})$.
Since $B$ is non-singular the left factor in (26) is of rank zero; whence
$$a_ib_{i,r+1}b_{ik}\cdots b_{im}=0\qquad(i\text{ not summed}).$$
By an argument used in the proof of Theorem XI, the 2-way matrix
$Q=(b_{jk}\cdots b_{lm})$ is non-singular where $\tau=j\cdots l$ is the partition of the
rows of $Q$ and $\sigma=k\cdots m$ the partition of the columns. The products
$b_{ik}\cdots b_{im}$, $i$ not summed, are the elements in the row of $Q$ obtained by
setting $j=\cdots=l=i$. Since $Q$ is non-singular, these do not all vanish for
a given $i$. Hence $a_ib_{i,r+1}=0$ for every $i$. Similarly $a_ib_{ij}=0$ ($i$ not summed)
for $j=r+2,\ldots,n$. Since $a_i\neq0$ for every $i$, we have $b_{ij}=0$ for $i=1,\ldots,r$;
$j=r+1,\ldots,n$. If $B$ is of this form, the terms on the left of (25) involve
only $y_1,\ldots,y_r$. Theorem XI now implies that $B_{11}$ has exactly one element
not zero in each row and column.


COROLLARY II. Let $\phi$ be an $(n,p)$-proper field. If the non-singular transformation
$x_i=b_{ij}y_j$, $(i,j=1,\ldots,n)$, brings a sum of $p$th powers $a_ix_i^p$, $(p\geq3$;
$i=1,\ldots,r$; $a_i\neq0)$, into a sum of $p$th powers $b_iy_i^p$, $(i=1,\ldots,r$; $b_i\neq0)$, then
the transformation brings any sum of $q$th powers $F=c_ix_i^q$, $(i=1,\ldots,r)$, into
a sum of $q$th powers $F'=d_iy_i^q$, $(i=1,\ldots,r)$, where $F$ and $F'$ have the same
number of non-vanishing coefficients.


THEOREM XIV. Let $G$ be a sum of $p$th powers, $(p\geq3)$, of $x_1,\ldots,x_r$ with
non-vanishing coefficients. Let $F_i$, $(i=1,\ldots,s)$, be a sum of $q$th powers of
$x_1,\ldots,x_r$ for $i=1,\ldots,s$, where $q\geq3$ and the coefficients of $x_j^q$ for any $j$ are
not all zero in $F_1,\ldots,F_s$. Let $\phi$ be an $(n,p)$- and $(n,q)$-proper field. The class
of non-singular transformations $x_i=b_{ij}y_j$, $(i,j=1,\ldots,n)$, $(n\geq r)$, which bring
$G$ in $\phi$ into a like sum of powers $G'$, is identical with the class of transformations
which bring $F_1,\ldots,F_s$ into a like set $F_1',\ldots,F_s'$.


Theorem XIII implies that the transformations bringing $G$ into $G'$ bring
$F_1,\ldots,F_s$ into $F_1',\ldots,F_s'$. To prove the converse, assume that $\phi$ contains
an infinite number of elements. Write

$$F_i=\mu_{ij}x_j^q,\qquad F_i'=\nu_{ij}y_j^q,$$

where by our assumptions $\mu_{1j},\ldots,\mu_{sj}$ do not all vanish for a given $j$. Similar
properties hold for $\nu_{1j},\ldots,\nu_{sj}$. Write

$$L_j(z)=\sum_{i=1}^{s}\mu_{ij}z_i,\qquad L_j'(z)=\sum_{i=1}^{s}\nu_{ij}z_i.$$

Choose a set of values $a_1,\ldots,a_s$ of $z_1,\ldots,z_s$ in $\phi$ such that* $L_j(a)$, $L_j'(a)\neq0$,
$(j=1,\ldots,r)$. Let $F=a_1F_1+\cdots+a_sF_s=L_1(a)x_1^q+\cdots+L_r(a)x_r^q$.
The transform $F'=a_1F_1'+\cdots+a_sF_s'$ of $F$ is $L_1'(a)y_1^q+\cdots+L_r'(a)y_r^q$.
Since none of the coefficients of $x_1^q,\ldots,x_r^q$, $y_1^q,\ldots,y_r^q$ in $F$ and $F'$ vanish,
by Theorem XIII, $G$ transforms into a like form $G'$.

If $\phi$ is a finite field, embed $\phi$ in an algebraically closed field $\psi$ containing it.
The above argument now applies for $\psi$. Since the coefficients of $G$ and the $b_{ij}$
are in $\phi$, the coefficients of $G'$ are in $\phi$.

9. Equivalence of forms of degree p, $p\geq4$, to sums of pth powers. We
shall prove the following theorem:


THEOREM XV. Let $\phi$ be an $(n,p)$-proper field. Let $F=a_{ij\cdots m}x_ix_j\cdots x_m$
be an $n$-ary form of degree $p$, $(p\geq4)$, for which a subform $F_{\alpha\cdots\xi}$
of degree $s$, $(s\geq3)$, is equivalent in $\phi$ to a sum $S=\mu_iy_i^s$,
$(\mu_i\neq0)$, of $n$ $s$th powers for some set of values $\alpha,\ldots,\xi$ of $i,\ldots,t$. The form
$F$ is equivalent in $\phi$ to a sum of $n$ $p$th powers if and only if under reduction of
$F_{\alpha\cdots\xi}$ to $S$ in $\phi$, $F$ transforms covariantly into a sum of $p$th powers.


Let $\Sigma$ denote the set of values $\alpha,\beta,\ldots,\delta,\rho,\tau,\xi$ of the leading indices
$i,\ldots,t$ of $F$. A study of minors of $(a_{i\Gamma})$, where $\Gamma$ is the partition $j\cdots m$,
reveals that the principal determinant rank of any subform $F_\sigma$ ($\sigma$ here denotes
a fixed partition) is greater than or equal to the principal determinant
rank of a subform $F_\Sigma$ if $\sigma$ is contained in $\Sigma$. By Theorem I the principal
determinant rank of $F_{\alpha\cdots\xi}$ is $n$. Hence this rank of $F$ is $n$. By Theorem X, $F$ is
equivalent in $\phi$ to a sum of $n$ $p$th powers if and only if $F_1,\ldots,F_n$ are
simultaneously equivalent in $\phi$ to sums of $(p-1)$th powers. Since $\alpha$ is contained in the
set $\alpha,\ldots,\xi$, the principal determinant rank of $F_\alpha$ is $n$. Now $F_\alpha$ occurs in the
set $F_1,\ldots,F_n$. If $F$ is equivalent in $\phi$ to a sum of $n$ $p$th powers, $F_\alpha$ is equivalent
in $\phi$ to a sum of $(p-1)$th powers. There are $n$ such powers because the
principal determinant rank of $F_\alpha$ is $n$. By Theorem X, $F_\alpha$ is equivalent to such
a sum if and only if $F_{\alpha1},\ldots,F_{\alpha n}$ are simultaneously equivalent in $\phi$ to sums
of $(p-2)$th powers. The subform $F_{\alpha\beta}$ has principal determinant rank $n$ and
occurs in this set. Continuing this process, we find that if $F$ is equivalent in $\phi$
to a sum of $p$th powers, $F_{\alpha\cdots\tau}$ is equivalent in $\phi$ to a sum of $n$ $(s+1)$th powers.
By Theorem X, this equivalence is valid if and only if $F_{\alpha\beta\cdots\tau1},\ldots,F_{\alpha\beta\cdots\tau n}$
are simultaneously equivalent in $\phi$ to sums of $s$th powers. The subform
$F_{\alpha\beta\cdots\tau\xi}$ occurs in this set.


* O. Haupt, Einführung in die Algebra, vol. 1, p. 166.
Reduce $F_{\alpha\beta\cdots\tau\xi}$ non-singularly to a sum of $s$th powers $S$. Simultaneously
the remaining forms in the set $F_{\alpha\beta\cdots\tau1},\ldots,F_{\alpha\beta\cdots\tau n}$ are transformed into a
new set $F'_{\alpha\beta\cdots\tau1},\ldots,F'_{\alpha\beta\cdots\tau n}$. By the corollary of Theorem XII, these forms
are simultaneously equivalent in $\phi$ to sums of $s$th powers if and only if they
are already such sums. Assume that they are such sums. By a remark near
the end of §7, the transformation $T$ that reduces $F_{\alpha\beta\cdots\tau1},\ldots,F_{\alpha\beta\cdots\tau n}$
to sums of $s$th powers reduces $F_{\alpha\beta\cdots\tau}$ to a sum $F'_{\alpha\beta\cdots\tau}$ of $(s+1)$th powers.
Let the result of applying $T$ to $F_{\alpha\beta\cdots\rho1},\ldots,F_{\alpha\beta\cdots\rho n}$ be denoted by
$F'_{\alpha\beta\cdots\rho1},\ldots,F'_{\alpha\beta\cdots\rho n}$. Since $F'_{\alpha\beta\cdots\rho\tau}$ is in this set and is already a sum of
$(s+1)$th powers, the forms $F'_{\alpha\beta\cdots\rho1},\ldots,F'_{\alpha\beta\cdots\rho n}$ are, by the corollary of
Theorem XII, simultaneously equivalent to sums of $(s+1)$th powers if and
only if they are already such sums. If they are such sums, by the remark of §7
referred to above, the transformation $T$ brings $F_{\alpha\beta\cdots\rho}$ into a form $F'_{\alpha\beta\cdots\rho}$,
which is a sum of $(s+2)$th powers. Continuing this chain of reasoning we see
that the $n$ transforms $F_1',\ldots,F_n'$ of $F_1,\ldots,F_n$ are simultaneously equivalent
in $\phi$ to sums of $(p-1)$th powers if and only if they are already such sums.
The transformation $T$ then brings $F$ into a sum of $p$th powers $F'$.

Hence if $F$ is equivalent in $\phi$ to a sum of $n$ $p$th powers, the transformation
$T$ which reduces $F_{\alpha\cdots\xi}$ to a sum of $n$ $s$th powers must reduce $F$ to a sum of
$n$ $p$th powers. The sufficiency of this condition is obvious.

10. Simultaneous equivalence of forms of degree p, $(p\geq3)$, to sums of
pth powers. The assumption made in Theorem XV concerning the existence
of a subform $F_{\alpha\cdots\xi}$ of degree $s$, $(s\geq3)$, equivalent to a sum of $n$ $s$th powers is
evidently not a necessary condition for the equivalence of a form $F$ to a sum
of $n$ $p$th powers. It is therefore convenient to have a method of determining
whether or not a set of forms in $n$ variables of degree $p$, $(p\geq3)$, are simultaneously
equivalent to sums of $p$th powers where no form in the set is equivalent
to a sum of $n$ $p$th powers. I have developed such a theory which involves the
theorems of §§7-9, Theorem XVI below, and other considerations. Since parts
of the method are complicated, the development will be left to the reader.
Essential to this theory is the following theorem:


THEOREM XVI. Let

$$B=(b_{ij})=\begin{pmatrix}B_{11}&0\\ B_{21}&B_{22}\end{pmatrix},$$

where $B_{11}$ is a minor of order $r$ with exactly one non-vanishing element in each
row and column. For the form

of degree $p$, $(p\geq3)$, to be equivalent in an $(n,p)$-proper field $\phi$ under the
non-singular transformation
singular transformation 


$$x_i=b_{ij}y_j,\qquad i,j=1,\ldots,n,$$

with matrix $B$ to a sum of $p$th powers
$$F'=\lambda_\alpha y_\alpha^p,\qquad\alpha=1,\ldots,r+g,\quad\lambda_{r+1}\cdots\lambda_{r+g}\neq0,$$

for some $g$, it is necessary that the form


be equivalent in o under the non-singular transformation 


with matrix $B_{22}$ to a form of the type
$$L'=\lambda_\alpha y_\alpha^p;\qquad\alpha=r+1,\ldots,r+g.$$


Assume that F’ is equivalent in ¢ to F for some choice of the \’s and g 
under the transformation 


(28) Vi = jmi,---,8, 


with matrix B. This implies that there exist a transformation (28), \’s, and 
a g such that 


r+g 


a=1 
Since }.;=0 fora=1,---,randi=r+1,---, m, we derive from (29) the 
relations 
r+g 
a=r+l1 


If (30) is satisfied, there exists a non-singular transformation 
Va = daix;, 


and a choice of L’ such that L’ reduces to L. Hence there exists a non-singular 
transformation (27) bringing L into L’. 

It is to be noted that if $F_1,\ldots,F_t$ are sums of $p$th powers, then a set
of forms $F_1,\ldots,F_t,F_{t+1},\ldots,F_s$ are simultaneously equivalent in a field $\phi$
to sums of $p$th powers if and only if $F_{t+1},\ldots,F_s$ are equivalent to sums of
$p$th powers under transformations which bring $F_1,\ldots,F_t$ into sums of $p$th
powers. These transformations can be written down by Theorems XIII and
XIV.


11. Equivalence of a binary form to a sum of two pth powers. We shall
prove a modification of Theorem XV. In this section we shall denote the
subform


of $F = a_{ij\cdots m}x_i x_j \cdots x_m$, $(i, j, \dots, m = 1, 2)$, by $F_{\rho\tau}$. In $F_{\rho\tau}$ there are $\rho$ fixed
consecutive subscripts equal to 1 and $\tau$ equal to 2, the remaining subscripts being
free. We shall need the following lemma:


LEMMA III. Let $\Phi$ be a $(2, p)$-proper field. Let $F = a_{ij\cdots m}x_i x_j \cdots x_m$ be a
binary form of degree $p$ with principal determinant rank 2. If the principal
determinant rank of a subform $F_{\rho\tau} = x_1F_{\rho+1,\tau} + x_2F_{\rho,\tau+1}$ of degree greater than or
equal to 3 is 2, while this rank is 1 for each of the subforms $F_{\rho+1,\tau}$, $F_{\rho,\tau+1}$, the
form $F$ is equivalent in $\Phi$ to a sum of pth powers if and only if $F$ is already a
sum of pth powers; then $F_{\rho\tau} = F$.


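The columns of the ordinary 2-way matrix $(a_{i\pi})$, where $\pi$ is the partition
$j \cdots m$, are, by the symmetry of $(a_{ij\cdots m})$, the columns of the matrix

$$A' = \begin{pmatrix} a_{11\cdots1} & a_{11\cdots12} & a_{11\cdots122} & \cdots & a_{12\cdots2} \\ a_{21\cdots1} & a_{21\cdots12} & a_{21\cdots122} & \cdots & a_{22\cdots2} \end{pmatrix},$$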

whence the rank of $A'$ is the principal determinant rank of $F$. Since this rank
of $F$ is 2, some minor


$N =$


of $A'$ is non-singular, where for each element of the minor matrix $N$ the first
subscript is followed by sets of equal numbers, the first containing $\rho$ 1's, the
second $\sigma$ 1's or 2's, and the last $\tau$ 2's. $N$ is a minor of the matrix which is
given by


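(31) $M = \begin{pmatrix} \cdots & a_{1\,1\cdots1\,uv\cdots w\,2\cdots2} & \cdots \\ \cdots & a_{2\,1\cdots1\,uv\cdots w\,2\cdots2} & \cdots \end{pmatrix},$


where only a typical column of $M$ is displayed in (31). Here the first set of
equal subscripts contains $\rho$ 1's and the last $\tau$ 2's. Setting $u = 1$ and $u = 2$ in $M$
we obtain minors $M_1$ and $M_2$ which can be written as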


M,= = 


The matrix $M$ is composed of the columns of $M_1$ and $M_2$. The first
subscript for each element of $M_1$ is followed by $(\rho+1)$ 1's, and the set of
subscripts ends with a set of $\tau$ 2's. For the matrix $M_2$ the sets have $\rho$ 1's and
$(\tau+1)$ 2's, respectively. The only columns that $M_1$ and $M_2$ do not have in
common are the columns of $N$. Write


M, = (& -- &), Mz = £29), N = (&29£1), 


where the $\xi$'s are columns of $M_1$, $M_2$, $N$. The matrices $M$, $M_1$, $M_2$ are the
matrices of the forms $F_{\rho\tau}$, $F_{\rho+1,\tau}$, and $F_{\rho,\tau+1}$ of the lemma. We assume that the
principal determinant ranks of $F_{\rho+1,\tau}$, $F_{\rho,\tau+1}$ are 1; whence the ranks of $M_1$ and
M, equal 1. Assume that & 0. Since M, is of rank 1, &, & are linearly de- 
pendent. Since & also occurs among the vectors £41, -- + , & 1, and Mz is 
of rank 1, £ and £2, are linearly dependent. It follows that £, &, are linearly 
dependent contrary to assumption. 

Since &= --- =0, whence £,,:= -- - =£,-1=0, we make use of the 
assumption of the lemma that the degree of $F_{\rho\tau}$ is greater than or equal to 3,
and it therefore follows that there are at least two variables in the set
$u, v, \dots, w$. The elements


where the set of subscripts of the first element ends with $\tau+1$ 2's and the
second element has $\rho+1$ subscripts equal to 1, now occur among the vectors
2, - - - , &. These elements therefore vanish. Since they are the same as the 
elements 


of &, £,, where the subscripts fall into groups as in N, the form F,, is given by 
the equation 

Since $F_{\rho\tau}$ is thus a sum of $(\sigma+1)$th powers, $(\sigma+1 \ge 3)$, by Theorem XV, $F$ is
equivalent to a sum of pth powers if and only if it is already such a sum.

To complete the proof of the lemma we note that if $F$ is a sum of two pth
powers, the principal determinant rank of every subform $F_{\rho\tau} \ne F$ is 1 or 0;
whence the form $F_{\rho\tau}$ of the lemma is $F$.

The characteristic determinant of two quadratic forms $F$, $G$ in the same
variables $x_1$, $x_2$ with matrices $A$, $B$, respectively, is the determinant |

If $F$ is equivalent to a sum of 2 pth powers and there is no subform $F_{\rho\tau}$
satisfying the conditions of Lemma III, there is a quadratic form $F_{\rho\tau}$ with
rank 2. For a form $F$ having such a subform we shall prove the following
strengthened form of Theorems III and VI. It is to be noted that no
distinction is made between even and odd values of $p$.


THEOREM XVII. Let $\Phi$ be a $(2, p)$-proper field. Let $F = a_{ij\cdots m}x_i x_j \cdots x_m$,
$(i, \dots, m = 1, 2)$, be a binary form of degree $p$, $(p \ge 3)$, not already a sum of
pth powers. The form $F$ is equivalent in $\Phi$ to a sum of 2 pth powers if and only if
for some $\rho$, $\tau$ the quadratic subform $F_{\rho\tau}$ is of rank 2, the characteristic determinant
$D$ of $F_{\rho\tau}$, $F_{\rho-1,\tau+1}$ or $F_{\rho\tau}$, $F_{\rho+1,\tau-1}$ factors in $\Phi$ into distinct linear factors, and
under reduction of $D$ to canonical form $Kx_1'x_2'$ in $\Phi$, $F$ transforms covariantly
into a sum of pth powers.


That the rank of $F_{\rho\tau}$ equals 2 for some $\rho$, $\tau$ was noted above to be a
necessary condition. By the argument used in the proof of Theorem XV the
simultaneous equivalence in $\Phi$ of $F_{\rho\tau}$, $F_{\rho-1,\tau+1}$ and $F_{\rho\tau}$, $F_{\rho+1,\tau-1}$ to sums of squares is
also a necessary condition. By Theorem X these pairs of forms are
simultaneously equivalent in $\Phi$ to sums of squares if and only if the cubic forms
$F_{\rho-1,\tau}$, $F_{\rho,\tau-1}$, respectively, are equivalent to sums of cubes. By Theorem IIIa
this equivalence is possible if and
only if the characteristic determinants of $F_{\rho\tau}$, $F_{\rho-1,\tau+1}$ and $F_{\rho\tau}$, $F_{\rho+1,\tau-1}$,
respectively, factor in $\Phi$ into distinct linear factors. If it is assumed that this
necessary condition is satisfied for one of the pairs of quadratic subforms, one
of the cubic forms $F_{\rho-1,\tau}$ or $F_{\rho,\tau-1}$ is equivalent in $\Phi$ to a sum of 2 cubes. By
Theorem III the transformation in $\Phi$, which reduces $D$ to canonical form
$Kx_1'x_2'$, simultaneously reduces $F_{\rho-1,\tau}$ or $F_{\rho,\tau-1}$, as the case may be, to sums of
cubes. By Theorem XV, $F$ is now equivalent in $\Phi$ to a sum of pth powers if
and only if under reduction of $D$ to $Kx_1'x_2'$, $F$ transforms into a sum of pth
powers.
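
The condition on the characteristic determinant can be tried on a small numerical example. The following Python sketch is illustrative only and is not part of the original argument. It assumes the rational field in place of $\Phi$, and it assumes that the characteristic determinant of two quadratic forms with matrices $A$, $B$ may be taken as $\det(sA + tB)$, a binary quadratic form in auxiliary variables $s$, $t$; the convention actually used in the paper may differ. All function names are hypothetical.

```python
from itertools import product
from math import isqrt


def power_sum_tensor(p, terms):
    """Symmetric coefficient tensor of F = sum(lam * (l[0]*x1 + l[1]*x2)**p).

    `terms` is a list of (lam, (l0, l1)). Keys are index tuples of length p,
    with 0 standing for the subscript 1 and 1 for the subscript 2.
    """
    tensor = {}
    for idx in product((0, 1), repeat=p):
        total = 0
        for lam, l in terms:
            coeff = lam
            for i in idx:
                coeff *= l[i]
            total += coeff
        tensor[idx] = total
    return tensor


def quadratic_subform(tensor, p, rho, tau):
    """2x2 matrix of the subform with rho subscripts fixed at 1 and tau at 2."""
    assert rho + tau == p - 2, "a quadratic subform leaves two subscripts free"
    fixed = (0,) * rho + (1,) * tau
    return [[tensor[fixed + (i, j)] for j in (0, 1)] for i in (0, 1)]


def char_det_coeffs(A, B):
    """Coefficients (c_ss, c_st, c_tt) of det(s*A + t*B) (assumed convention)."""
    c_ss = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    c_tt = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    c_st = (A[0][0] * B[1][1] + A[1][1] * B[0][0]
            - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    return c_ss, c_st, c_tt


def has_distinct_rational_factors(coeffs):
    """True when an integer binary quadratic splits into two distinct
    linear factors over the rationals (nonzero square discriminant)."""
    c_ss, c_st, c_tt = coeffs
    disc = c_st * c_st - 4 * c_ss * c_tt
    return disc > 0 and isqrt(disc) ** 2 == disc


# F = (x1 + x2)^3 + (x1 + 2*x2)^3, a sum of two cubes by construction.
F = power_sum_tensor(3, [(1, (1, 1)), (1, (1, 2))])
A = quadratic_subform(F, 3, 1, 0)   # subform with rho = 1, tau = 0
B = quadratic_subform(F, 3, 0, 1)   # subform with rho = 0, tau = 1
coeffs = char_det_coeffs(A, B)
```

For this $F$ the two quadratic subforms have matrices $\begin{pmatrix} 2 & 3 \\ 3 & 5 \end{pmatrix}$ and $\begin{pmatrix} 3 & 5 \\ 5 & 9 \end{pmatrix}$, and under the assumed convention the determinant is $s^2 + 3st + 2t^2 = (s+t)(s+2t)$, which has distinct linear factors, as the theorem requires of a form equivalent to a sum of two cubes.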
TWO-DIMENSIONAL SPACES IN WHICH THERE EXIST
CONTIGUOUS POINTS*


BY 
E. C. KLIPPLE 


In a recent number of The Rice Institute Pamphlet† R. L. Moore has
formulated a set of axioms in terms of the undefined notions “point,”
“region,” and “contiguous to.” These axioms‡ (Axioms A, B, C, 0, 1, and 2 of
this paper) serve as the basis for the proofs of a considerable number of
theorems of ordinary point set theory, including a large proportion of the
theorems of the first two chapters of Moore’s book.§ Nevertheless, there exist
spaces satisfying these axioms in which an arc may contain only a finite
number of points and in which a region may consist of a finite number of points.

In the present paper a study is made of spaces which satisfy the above
mentioned axioms and some additional axioms which restrict the spaces to
being, in a certain sense, two dimensional. The ordinary euclidean plane is a
space which satisfies all the axioms.

I wish to acknowledge my indebtedness to Professor R. L. Moore, and to
thank him for suggesting the problem and for many helpful criticisms in the
course of its development.

For definitions of terms used but not defined here the reader may refer to 
S.C.P. 


DEFINITIONS. A simple closed curve is a compact continuum, containing at
least two distinct non-contiguous points, which is disconnected by the omission
of any two of its non-contiguous points. A triune is a set of three distinct points
such that each of them is contiguous to each of the others.


Axiom A. No point is contiguous to itself. 


Axiom B. If the point A is contiguous to the point B, then B is contiguous
to A.


* Presented to the Society, October 28, 1933; received by the editors August 18, 1937. 

† Vol. 23 (1936), no. 1. In the present treatment the abbreviation S.C.P. will be used to designate
part 1 of this paper.

‡ It being understood that in the statement of Axiom 2 of S.C.P. the word “non-degenerate” is
to be omitted. It is clear from the context that the retention of this word was not intended.

§ Foundations of Point Set Theory, American Mathematical Society Colloquium Publications, 
vol. 13, New York, 1932. In the present treatment the abbreviation P.S.T. will be used to designate 
this book. 
Axiom C. If M is a closed point set and every point of the set H is contiguous
to some point of M, then no point of S − M is a limit point of H.


Axiom 0. Every region is a point set. 


Axiom 1. There exists a sequence $G_1, G_2, G_3, \dots$ such that (1) for each $n$,
$G_n$ is a collection of regions covering S, (2) for each $n$, $G_{n+1}$ is a subcollection of $G_n$,
(3) if R is any region whatsoever, and if X is a point of R and Y is a point of R
either identical with X or not, then there exists a natural number $m$ such that if $g$
is any region belonging to the collection $G_m$ and containing X, then $g$ is a subset
of $(R - Y) + X$, (4) if $M_1, M_2, M_3, \dots$ is a sequence of closed point sets such
that $M_n$ contains $M_{n+1}$ for each $n$ and there exists a region $g_n$ of the collection $G_n$
such that $M_n$ is a subset of $\bar g_n$ for each $n$, then there is at least one point common to
all the point sets of the sequence $M_1, M_2, M_3, \dots$.


Axiom 2. If P is a point of a region R, there exists a connected domain con- 
taining P and lying in R. 

Axiom 3. If J is a simple closed curve or triune, then S —J is the sum of two 
mutually separated connected point sets such that J is the boundary of each of 
them. 
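
As an informal aid to reading Axioms A and B and the definition of a triune, the following Python sketch checks them on a finite contiguity relation. It is a hypothetical illustration and not part of the paper: the point names, the relation, and the function names are invented, and the sketch says nothing about the limit-point Axioms C, 0, 1, 2, 3.

```python
from itertools import combinations


def symmetric_closure(pairs):
    """Add (q, p) for every listed pair (p, q)."""
    return set(pairs) | {(q, p) for (p, q) in pairs}


def satisfies_axiom_a(points, contiguous):
    """Axiom A: no point is contiguous to itself."""
    return all((p, p) not in contiguous for p in points)


def satisfies_axiom_b(contiguous):
    """Axiom B: contiguity is a symmetric relation."""
    return all((q, p) in contiguous for (p, q) in contiguous)


def triunes(points, contiguous):
    """All sets of three distinct points that are pairwise contiguous."""
    found = []
    for trio in combinations(sorted(points), 3):
        if all((a, b) in contiguous and (b, a) in contiguous
               for a, b in combinations(trio, 2)):
            found.append(trio)
    return found


# Hypothetical example: A, B, C pairwise contiguous, D contiguous to C only.
points = {"A", "B", "C", "D"}
contiguous = symmetric_closure({("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")})
found = triunes(points, contiguous)
```

In this example the only triune is the set A, B, C; the point D is contiguous to C alone and so belongs to no triune.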


In this paper I deal very frequently with complementary domains of 
simple closed curves and triunes. If J is a simple closed curve or triune and 
w is a point of S—J, then by the interior of J with respect to w, is meant that 
complementary domain of J which does not contain w. Similarly, that com- 
plementary domain of J which contains w is called the exterior of J with respect 
to w. In case no ambiguity arises, the terms interior and exterior of J will be 
used without making specific reference to a point w. 


THEOREM 1. Let $J_1$ and $J_2$ denote two point sets each of which is either a
simple closed curve or a triune. Let $I_1$ and $I_2$ denote the interiors of $J_1$ and $J_2$,
respectively, and suppose $J_2$ is a subset of $J_1 + I_1$. Then $I_2$ is a subset of $I_1$.


THEOREM 2. If J is a simple closed curve, and A and B separate C and D on J,
and AXB is an arc such that the segment AXB is a subset of I, the interior of J,
and no point of the segment AXB is contiguous to any point of $J - (A + B)$,
then (1) $I_1$, the interior of AXBCA, is a subset of I, (2) the segment ADB is a
subset of the exterior of AXBCA, and (3) $I_1$ has no point in common with $I_2$,
the interior of AXBDA.


THEOREM 3. Under the hypothesis of Theorem 2, $I = I_1 + I_2 +$ segment AXB.


Proof. Suppose $I - (I_1 + I_2 + \text{segment } AXB) = M$, where M is a non-vacuous
point set. Let P be a point of M. By Theorem 38 of S.C.P. there exists an
arc PX from P to X lying in I. The arc PX contains an arc $PX'$ such that
$PX' - X'$ is a subset of M, and $X'$ is a point of the segment AXB. Similarly,
there exists an arc Pw lying in $E_2$, the exterior of AXBDA. The arc Pw
contains an arc $PC'$, such that $PC' - C'$ is a subset of M, and $C'$ is a point of
the segment ACB. Let T denote the first point in the order from $X'$ to P
which $X'P$ has in common with $PC'$. If no point of the interval $X'T$ of $X'P$,
except possibly T, is contiguous to any point of $(TC' - T)$, where $TC'$
denotes the interval of $PC'$ from T to $C'$, then $X'T + TC'$ is an arc from $X'$
to $C'$. If there exist points of $X'T - T$ which are contiguous to points of
$TC' - T$, there must be a first such point in the order from $X'$ to T. For
otherwise, there would be infinitely many such points, and the set of all such
points would have a limit point in $X'T - T$. But by Axiom C, every limit
point of such a set must belong to $TC'$, and a contradiction is reached. Let W
denote the first point of $X'T - T$ which is contiguous to a point of $TC' - T$.
By Axiom C, W is contiguous to only a finite number of points of $TC' - T$.
Let V denote the last point in the order from T to $C'$ which is contiguous to
W. Let $X'W$ denote the interval of $X'T$ from $X'$ to W or the point $X'$
according as W is not or is identical with $X'$. Let $VC'$ denote the interval of $TC'$
from V to $C'$ or the point $C'$ according as V is not or is identical with $C'$.
The point set $X'W + VC'$ is an arc containing at least three points. Thus, in
any case, the point set $PX' + PC'$ contains an arc $X'P'C'$, such that the
segment $X'P'C'$ contains at least one point and is a subset of M. Similarly, we
may show the existence of an arc $X''P''C''$ such that (1) the segment
$X''P''C''$ contains at least one point and is a subset of $I_1$ and (2) the points
$X''$ and $C''$ are points of the segments AXB and ACB, respectively. Let
$X'X''$ denote the point $X'$ or the arc of the segment AXB from $X'$ to $X''$,
according as $X''$ is or is not identical with $X'$. Also let $C'C''$ denote the point
$C'$ or the arc of the segment ACB from $C'$ to $C''$, according as $C''$ is or is not
identical with $C'$. By means of repeated applications of Axiom C it may be
shown that the point set $X'P'C' + C'C'' + X''P''C'' + X'X''$ contains a simple
closed curve $J'$, which contains at least one point of each of the segments
$X'P'C'$ and $X''P''C''$. Let $I'$ denote the interior of $J'$. Now $I'$ is a subset
of I by Theorem 1. Thus the segment ADB is a subset of $E'$, the exterior of $J'$.
Since the connected set $I_2$ plus the segment ADB contains no point of $J'$ but
does contain a point of $E'$, it follows that $I_2$ plus the segment ADB is a
subset of $E'$. Furthermore $I'$ cannot contain a point of the segment AXB. For,
suppose $I'$ contains the point F of the segment AXB. The connected set
$F + I_2$ contains no point of $J'$ but contains points of both complementary
domains of $J'$. We conclude from this contradiction that the segment AXB
is a subset of $J' + E'$. Now $I'$ cannot be a subset of M since there exists a
point of $J' \cdot I_1$ which is either a limit point of $I'$ or contiguous to a point of $I'$,
and M and $I_1$ are two mutually separated domains. Also, $I'$ cannot be a
subset of $I_1$ since there exists a point of $J' \cdot M$ which is either a limit point of $I'$
or contiguous to a point of $I'$, and M and $I_1$ are two mutually separated
domains. Also $I'$ cannot be a subset of $M + I_1$ and contain points of both M
and $I_1$ since $I'$ would thus be the sum of two mutually separated sets,
contrary to Axiom 3. Thus in any case we reach a contradiction, and the
theorem is established.


THEOREM 4. If the points A and B separate the points C and D on the
simple closed curve J, and if the segments AXB and CYD are both subsets of I, the
interior of J, then these segments have at least one point in common.


Proof. Suppose the segments AXB and CYD have no point in common.
There exists an arc $A'w$ such that $A'w - A'$ is a subset of E, the exterior of J,
and such that $A'$ is a point of the segment CAD of J. Similarly, there exists
an arc $B'w$ such that $B'w - B'$ is a subset of E and $B'$ is a point of the segment
CBD. The point set $A'w + B'w$ contains an arc $A'X'B'$ such that the segment
$A'X'B'$ is a subset of E. Let $AA'$ denote the point A or the arc of the
segment CAD from A to $A'$ according as $A'$ is or is not A, and let $BB'$ denote the
point B or the arc of the segment CBD from B to $B'$ according as $B'$ is or is
not B. It may be shown that there exists a simple closed curve $J^*$ satisfying
the following conditions: (1) $J^*$ is a subset of $AA' + BB' + AXB + A'X'B'$,
(2) $J^*$ contains at least one point of each of the segments AXB and $A'X'B'$,
and (3) $J^* \cdot J$ is the sum of two mutually separated continua which separate
C and D on J; therefore $J - J^* \cdot J = g_1 + g_2$, where $g_1$ and $g_2$ are mutually
separated segments containing C and D, respectively. Since $J^*$ contains no point
of the arc CYD, it follows that the connected point set $CYD + g_1 + g_2$ is a
subset of $I^*$, a complementary domain of $J^*$. Thus J is a subset of $J^* + I^*$. Hence,
by Theorem 1, either I or E is a subset of $I^*$. But each of the sets I and E
contains a point of $J^*$, and $I^*$ contains no point of $J^*$. Thus we reach a
contradiction, and the theorem is established.


THEOREM 5. If J is a simple closed curve or triune, then I, the interior of J, 
contains infinitely many points. 


Proof. Suppose I contains exactly $n$ points, where $n$ denotes a natural
number. Thus every point of J is contiguous to some point of I. If there are
infinitely many points of J, the closed point set I contains a limit point of J
by Axiom C. But this is impossible since J is closed. Hence J contains only a
finite number of points. Let A and B be two contiguous points of J, and let $X_0$
be a point of $J - (A + B)$. Let X be a point of I which is contiguous to A and
Y a point of I contiguous to B. Let XY denote the point X or an arc from X
to Y lying in I, according as Y is or is not identical with X. The point set
$XY + A + B$ contains a simple closed curve or triune $J_1$ which is a subset of
$J + I$ and contains A, B, and a point $X_1$ of I but is such that $X_0$ is in the
exterior of $J_1$. Similarly, there exists a simple closed curve or triune $J_2$ which is
a subset of $J_1$ plus its interior and contains A, B, and a point $X_2$ of the interior
of $J_1$ but is such that $X_1$ is in the exterior of $J_2$. By continuing the indicated
process $n+1$ times we reach a contradiction, since $X_1, X_2, X_3, \dots, X_{n+1}$ are
distinct points of I. Hence I must contain infinitely many points.


THEOREM 6. Let J denote a simple closed curve whose interior I contains
a point P which is contiguous to at least three distinct points of J. Let
$P_1, P_2, \dots, P_n$, $(n \ge 3)$, be points of J (in the order indicated if $n > 3$), and
let $\beta = P_1 + P_2 + \cdots + P_n$. Suppose $\beta$ is the set of all points of J that are
contiguous to P. Let $P_kP_{k+1}$, $(k = 1, 2, \dots, n)$, denote that arc of J which
contains only the points $P_k$ and $P_{k+1}$ of the set $\beta$, and let $P_{n+1}$ denote $P_1$. Let $I_k$,
$(k = 1, 2, \dots, n)$, denote the interior of the triune or simple closed curve
$P + P_kP_{k+1}$. Then $I = P + \sum_{k=1}^{n} I_k$.

Proof. By Theorem 1, $I_k$, $(k = 1, 2, \dots, n)$, is a subset of I. Also, if $k \ne j$,
then $I_k$ and $I_j$ are mutually exclusive. Suppose $I - (P + \sum_{k=1}^{n} I_k) = M$, where
M is a non-vacuous point set, and let $M_1$ denote a component of M. Now M
is a domain; hence $M_1$ is a domain. Let $\alpha$ denote the set of all points of J,
each of which is either a limit point of $M_1$ or contiguous to a point of $M_1$.
There exists at least one point of $\alpha$. For let $Q_1$ denote any point of $M_1$, and
let $Q_1w$ denote an arc from $Q_1$ to w lying in the exterior of the simple closed
curve or triune $P + P_1P_2$. Let $Q_1'$ denote the first point that $Q_1w$ has, in order
$Q_1$ to w, in common with J. Now $Q_1'$ is obviously a point of $\alpha$, since $Q_1'$ is
the first point of $Q_1w$, in order $Q_1$ to w, which does not belong to $M_1$.

Suppose there exist two points X and Y of $\alpha$, which do not lie together
on one of the arcs $P_1P_2, P_2P_3, \dots, P_nP_1$. Since X and Y are non-contiguous
points of J, it follows that J is the sum of two arcs from X to Y having
nothing in common except their end points such that the corresponding segments
of these arcs are mutually separated. There exists a point $P_r$ of $\beta$ on one of
these segments and a point $P_s$ of $\beta$ on the other. Since $P_r$ and $P_s$ are both
points of $\beta$, $P_r + P + P_s$ is an arc lying in $I + J - (M + X + Y)$. There exists an
arc $P_rOP_s$ which is a subset of $J + E - (X + Y)$, where E denotes the exterior
of J, and is such that $P_rOP_s \cdot J$ is the sum of two mutually separated
connected point sets $M_r$ and $M_s$ which contain $P_r$ and $P_s$, respectively, but no
other points of $\beta$, and which separate X and Y on J. The point set $P + P_rOP_s$
is a simple closed curve $J'$ which contains P and is such that $J' \cdot J$ is the sum
of two mutually separated connected point sets which separate X and Y on J.
Thus $J - J' \cdot J$ is the sum of two mutually separated segments $g_X$ and $g_Y$
which contain X and Y, respectively. Now $J'$ separates X from Y, for
otherwise $g_X$ and $g_Y$ would lie together in $I'$, a complementary domain of $J'$;
therefore by Theorem 1, either I or E would be a subset of $I'$, which is contrary to
the fact that both I and E contain points of $J'$. Since X and Y are boundary
points of $M_1$ lying in different complementary domains of $J'$, it follows that
there are points of $M_1$ lying in different complementary domains of $J'$. This
is impossible since $M_1$ contains no point of $J'$. From this contradiction we
conclude that if X and Y are any two distinct points of $\alpha$, then there exists
an arc of the set $P_1P_2, P_2P_3, \dots, P_nP_1$ which contains both X and Y.

Thus, if $J - \beta$ contains a point X of $\alpha$, then the arc of the set $P_1P_2$,
$P_2P_3, \dots, P_nP_1$ which contains X must contain all of $\alpha$. Now if $\alpha$ is not a
subset of one of the arcs of the set $P_1P_2, P_2P_3, \dots, P_nP_1$, it follows that
$J - \beta$ contains no point of $\alpha$. Furthermore, it follows that $n = 3$ and
$\alpha = \beta = P_1 + P_2 + P_3$. In this case, since J is not a triune, one of the
segments $P_1P_2$, $P_2P_3$, $P_3P_1$ exists. Suppose the segment $P_3P_1$ is non-vacuous.
Using methods similar to those used in the early part of the proof to get $J'$,
we obtain a simple closed curve $J^*$ satisfying the following conditions: (1) $J^*$
is a subset of $P + J + I_3 + E$, (2) $J^*$ contains P, $P_2$, a point of the segment
$P_3P_1$, a point of E, and a point of $I_3$, and (3) $J^* \cdot J$ is the sum of two mutually
separated connected point sets which separate $P_1$ and $P_3$ on J, and $J - J^* \cdot J$
is therefore the sum of two mutually separated segments, one containing $P_1$
and the other containing $P_3$. The argument used above to show that $J'$
separates X and Y may be applied here to show that $J^*$ separates $P_1$ and $P_3$.
Hence $J^*$ separates two points $P_1'$ and $P_3'$ of $M_1$, since $P_1$ and $P_3$ are points
of $\alpha$. But this is impossible since $M_1$ is connected and contains no point of $J^*$.
Thus we reach a contradiction, and we conclude that there exists an arc
$P_jP_{j+1}$ of the set of arcs $P_1P_2, P_2P_3, \dots, P_nP_1$ which contains $\alpha$.

Let $E_j$ denote the exterior of the simple closed curve or triune $(P + P_jP_{j+1})$.
The domain $E_j$ contains $M_1$, and we write $E_j = M_1 + (E_j - M_1)$. The boundary
of $M_1$ is a subset of $(P + P_jP_{j+1})$. Hence $M_1$ and $E_j - M_1$ are mutually
separated, contrary to Axiom 3. Thus we reach a contradiction, and the theorem
is established.


THEOREM 7. Let $P_1, P_2, \dots, P_n$, $(n \ge 3)$, be points of the simple closed curve
J (in the order indicated if $n > 3$). Let $A_1, A_2, \dots, A_k$, $(k \ge 2)$, be points of the
arc $A_1A_k$, (in the order indicated if $k > 2$) where $A_1A_k$ is a subset of I, the interior
of J. Let $\beta = P_1 + P_2 + \cdots + P_n$ and let $\gamma = A_1 + A_2 + \cdots + A_k$. Suppose $\beta$ is
the set of all points of J each of which is contiguous to at least one point of $A_1A_k$,
and suppose $\gamma$ is the set of all points of $A_1A_k$ each of which is contiguous to at
least one point of $\beta$. Suppose furthermore that no point of $\beta$ is contiguous to both
$A_1$ and $A_k$ but that $P_1$ is contiguous to $A_1$ and to no other point of $\gamma$ and $P_n$ is
contiguous to $A_k$ and to no other point of $\gamma$. It follows that if $j < j'$, and if $P_j$
is contiguous to $A_i$ and $P_{j'}$ is contiguous to $A_{i'}$, then $i \le i'$. Let $P_jP_{j+1}$,
$(j = 1, 2, \dots, n)$, denote that arc of J from $P_j$ to $P_{j+1}$ which contains no
other point of $\beta$, where $P_{n+1}$ denotes $P_1$. Let $A_iA_{i+1}$, $(i = 1, 2, \dots, k-1)$, denote
the arc of $A_1A_k$ from $A_i$ to $A_{i+1}$. Let $I_j$, $(j = 1, 2, \dots, n-1)$, denote the interior
of $C_j$, where $C_j$ denotes the triune or simple closed curve $(P_jP_{j+1} + A_i)$ in case $P_j$
and $P_{j+1}$ are both contiguous to $A_i$, or $C_j$ denotes the simple closed curve
$(P_jP_{j+1} + A_iA_{i+1})$ in case no point of $\gamma$ is contiguous to both $P_j$ and $P_{j+1}$,
but where $P_j$ is contiguous to $A_i$ and $P_{j+1}$ is contiguous to $A_{i+1}$. Let $I_n$ denote
the interior of the simple closed curve $(P_nP_1 + A_1A_k)$. If $P_j$ is contiguous to each


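of the points $A_{i_j}, A_{i_j+1}, \dots, A_{i_j+k_j}$, let $I_{j1}, I_{j2}, \dots, I_{jk_j}$ denote the interiors of
$(P_j + A_{i_j}A_{i_j+1})$, $(P_j + A_{i_j+1}A_{i_j+2})$, $\dots$, and $(P_j + A_{i_j+k_j-1}A_{i_j+k_j})$, respectively.
Then

$$I = A_1A_k + \sum_{j=1}^{n} I_j + \sum_{j=1}^{n} \sum_{t=1}^{k_j} I_{jt},$$

where $I_{jt}$ is a null set if $P_j$ is contiguous to only one point of $\gamma$.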


The theorem is illustrated by Fig. 1. 


Proof. First I shall prove the assertion made in the statement of the
theorem to the effect that if $j < j'$, and if $P_j$ is contiguous to $A_i$ and $P_{j'}$ is
contiguous to $A_{i'}$, then $i \le i'$. Suppose there exist two integers $j$ and $j'$ such that
$j < j'$, $P_j$ and $P_{j'}$ are contiguous to $A_i$ and $A_{i'}$, respectively, and $i > i'$. It
follows that $i > 1$ and $i' < k$. Also since $P_1$ is contiguous to $A_1$ but to no other
point of $\gamma$, we see that $1 < j$. Again, since $P_n$ is contiguous to $A_k$ but to no
other point of $\gamma$, we see that $j' < n$. Thus $1 < j < j' < n$; hence $P_j$ and $P_{j'}$
separate $P_1$ and $P_n$ on J. Let $A_{i_1'}$ be the point of $\gamma$ with lowest subscript which
is contiguous to $P_{j'}$, and let $A_{i_1}$ be the point of $\gamma$ with the highest subscript
which is contiguous to $P_j$. Let $A_1A_{i_1'}$ denote the arc $A_1A_{i_1'}$ of $A_1A_k$ or the
point $A_1$ according as $i_1'$ is not or is 1, and let $A_{i_1}A_k$ denote the arc $A_{i_1}A_k$
of $A_1A_k$ or the point $A_k$ according as $i_1$ is not or is $k$. Now $P_1 + A_1A_{i_1'} + P_{j'}$
and $P_j + A_{i_1}A_k + P_n$ are arcs satisfying the conditions of Theorem 4. Hence
$A_1A_{i_1'}$ and $A_{i_1}A_k$ have a point in common. But $1 \le i_1' \le i' < i \le i_1 \le k$;
therefore $A_1A_{i_1'}$ and $A_{i_1}A_k$ cannot have a point in common. Thus we reach a
contradiction, and we conclude that if $j < j'$, and if $P_j$ and $P_{j'}$ are contiguous to
$A_i$ and $A_{i'}$, respectively, then $i \le i'$.
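
The order relation just proved can be stated as a check on a finite table of contiguities. The following Python sketch is a hypothetical illustration, not taken from the paper: a pair $(j, i)$ records that $P_j$ is contiguous to $A_i$, the sample tables are invented, and the function names are assumptions.

```python
def is_order_preserving(pairs):
    """True when j < j' always forces i <= i' among the recorded pairs.

    Each pair (j, i) means that P_j is contiguous to A_i.
    """
    pairs = list(pairs)
    return all(i <= i2
               for (j, i) in pairs
               for (j2, i2) in pairs
               if j < j2)


def first_violation(pairs):
    """Return the first ((j, i), (j', i')) with j < j' and i > i', or None."""
    ordered = sorted(pairs)
    for (j, i) in ordered:
        for (j2, i2) in ordered:
            if j < j2 and i > i2:
                return (j, i), (j2, i2)
    return None


# A table consistent with the assertion: subscripts of A never decrease
# as the subscripts of P increase.
good = [(1, 1), (2, 1), (2, 2), (3, 2), (4, 3)]

# A table violating it: P_2 reaches A_3 while the later P_3 reaches A_2.
bad = [(1, 1), (2, 3), (3, 2), (4, 3)]
```

In the second table the pair of contiguities $P_2 \sim A_3$, $P_3 \sim A_2$ is exactly the configuration that the proof excludes by means of Theorem 4.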

This result enables us to verify the following implications made in the
statement of the theorem: (1) there do not exist two distinct points of $\gamma$ each
of which is contiguous to both $P_j$ and $P_{j+1}$; (2) if no point of $\gamma$ is contiguous
to both $P_j$ and $P_{j+1}$, then there exists an integer $i$ such that $A_i$ is contiguous
to $P_j$ and $A_{i+1}$ is contiguous to $P_{j+1}$; and (3) if no point of $\beta$ is contiguous to
both $A_i$ and $A_{i+1}$, then there exists an integer $j$ such that $j < n$ and $P_j$ and
$P_{j+1}$ are contiguous to $A_i$ and $A_{i+1}$, respectively.

Suppose $I - (A_1A_k + \sum_{j=1}^{n} I_j + \sum_{j=1}^{n} \sum_{t=1}^{k_j} I_{jt}) = M$ is a non-vacuous point
set. Let $M_1$ be a component of the domain M, and let $\alpha$ denote the boundary
of $M_1$. Obviously $\alpha$ is a subset of $J + A_1A_k$. Furthermore, $\alpha$ contains at least
one point of J and at least one point of $A_1A_k$. For let P be a point of $M_1$,
and let Pw be an arc from P to w lying in the exterior of $(A_1A_k + P_nP_1)$. Now
Pw obviously contains a point of $\alpha \cdot J$. Similarly, let $PA_1$ denote an arc from P
to $A_1$ which lies in I. Now $PA_1$ contains a point of $\alpha \cdot A_1A_k$.

Suppose J contains two non-contiguous points X and Y of $\alpha$. If X and Y
do not lie together on any one of the arcs $P_1P_2, P_2P_3, \dots, P_nP_1$, there exist
two points $P_r$ and $P_s$ of $\beta$ which separate X and Y on J; hence there exists a
simple closed curve $J'$ such that: (1) $J'$ is a subset of $E + J + A_1A_k$; (2)
$J' \cdot J = M_r + M_s$ where $M_r$ and $M_s$ are two mutually separated continua which
contain $P_r$ and $P_s$, respectively, and which separate X and Y on J; therefore
$J - J' \cdot J = g_X + g_Y$ where $g_X$ and $g_Y$ are mutually separated segments
containing X and Y, respectively; (3) $J'$ contains at least one point O of E and at
least one point of $A_1A_k$. If X and Y lie together on the arc $P_jP_{j+1}$ of the set
$P_1P_2, P_2P_3, \dots, P_nP_1$, there exists a simple closed curve $J'$ such that: (1) $J'$
is a subset of plus the segment XY of —P;Pi41); 
(2) $J' \cdot J$ is the sum of two mutually separated continua which separate X
and Y on J; hence $J - J' \cdot J = g_X + g_Y$ where $g_X$ and $g_Y$ are mutually separated
segments containing X and Y, respectively; (3) $J'$ contains at least one point
O of E and at least one point of $A_1A_k$. Now, in either case it may be readily
shown that $J'$ separates X from Y and hence that $J'$ separates two points
$X'$ and $Y'$ of $M_1$. But this is impossible since $M_1$ is connected and contains
no point of $J'$. From this contradiction we conclude that $\alpha \cdot J$ consists either
of a single point or of two contiguous points. By an analogous argument it
may be shown that $\alpha \cdot A_1A_k$ consists either of a single point or of two
contiguous points.

Now I wish to show that $\alpha$ is a subset of a simple closed curve or triune $J^*$,
which is a subset of $J + A_1A_k$, satisfying the condition that either there exists
an integer $j$ such that $I_j$ is the interior of $J^*$, or there exists a pair of integers
$(j, t)$ such that $I_{jt}$ is the interior of $J^*$. If $\alpha \cdot J$ is a subset of $P_nP_1$, then
$J^* = P_nP_1 + A_1A_k$. If $\alpha \cdot J$ is not a subset of $P_nP_1$, we shall use the following
procedure.
Case 1. Suppose that $P_jP_{j+1}$, $(j \ne n)$, is the only arc of the set $P_1P_2$,
$P_2P_3, \dots, P_nP_1$ which contains $\alpha \cdot J$. It follows that if X denotes either
$P_j$ or $P_{j+1}$, then $P_jP_{j+1} - X$ contains at least one point Z of $\alpha$. If $P_j$ and
$P_{j+1}$ are both contiguous to the point $A_i$ of $\gamma$, then we may show that
$A_i = \alpha \cdot A_1A_k$. Suppose the contrary. Let Y denote a point of $\alpha \cdot A_1A_k$ which
is different from $A_i$. Now Y either precedes or follows $A_i$, in the order $A_1$
to $A_k$. Suppose Y precedes $A_i$. Then $i \ne 1$ and $j \ne 1$, and Y is not contiguous
to $P_{j+1}$. There exists a simple closed curve $J'$ having the following properties:
(1) $J'$ is a subset of $J + A_1A_i + I_n$; (2) $J' \cdot J$ is a connected point set such that
the connected point set $J - J' \cdot J$ contains $P_n + (P_jP_{j+1} - P_j)$; (3) $J' \cdot A_1A_k$ is a
connected set such that $A_1A_k - J' \cdot A_1A_k$ is either a connected set containing
Y and $A_1$, or the sum of two mutually separated connected sets, one
containing Y and $A_1$ and the other containing $A_k$; (4) $J'$ contains at least one point
of $I_n$ and at least one point of $J - P_nP_1$. By means of these four conditions
imposed on $J'$ we may readily show that $J'$ separates Y from Z and hence
that $J'$ separates two points $Y'$ and $Z'$ of $M_1$. But this is impossible since $M_1$
is connected and contains no point of $J'$. Thus we conclude that Y cannot
precede $A_i$ on $A_1A_k$. Similarly Y cannot follow $A_i$ on $A_1A_k$. Therefore
$A_i = \alpha \cdot A_1A_k$. Thus $(A_i + P_jP_{j+1})$ is a simple closed curve or triune $J^*$ which
contains $\alpha$ and whose interior is $I_j$.

If there exists no point of $\gamma$ which is contiguous to both $P_j$ and $P_{j+1}$, then
there exists an integer $i$ such that $P_j$ is contiguous to $A_i$ and $P_{j+1}$ is
contiguous to $A_{i+1}$. In this case we show that $\alpha \cdot A_1A_k$ is a subset of $A_iA_{i+1}$.
Suppose the contrary. Let Y denote a point of $\alpha \cdot A_1A_k$ not on $A_iA_{i+1}$. Either Y
precedes $A_i$ or Y follows $A_{i+1}$, in the order $A_1$ to $A_k$. Suppose Y precedes $A_i$.
Then $i \ne 1$ and $j \ne 1$ or $n$. Now there exists a simple closed curve $J'$
having the following properties: (1) $J'$ is a subset of $J + A_1A_i + I_n$; (2) $J' \cdot J$
is a connected point set such that the connected set $J - J' \cdot J$ contains
(3) $J' \cdot A_1A_k$ is connected, and $A_1A_k - J' \cdot A_1A_k$ is the sum
of two mutually separated connected point sets, one containing Y and $A_1$,
and the other containing $A_{i+1}$ and $A_k$; (4) $J'$ contains at least one point of $I_n$
and at least one point of $J - P_nP_1$. Thus again $J'$ separates Y from Z and
hence separates two points $Y'$ and $Z'$ of $M_1$. Again we reach a contradiction,
and we conclude that Y cannot precede $A_i$. The same argument applies to
show that Y cannot follow $A_{i+1}$. Hence $\alpha \cdot A_1A_k$ is a subset of $A_iA_{i+1}$, and
$(P_jP_{j+1} + A_iA_{i+1})$ is therefore a simple closed curve $J^*$ containing $\alpha$. The
interior of $J^*$ is $I_j$.

Case 2. If there exists no integer $r$ such that $P_rP_{r+1}$ is the only arc of the
set $P_1P_2, P_2P_3, \dots, P_nP_1$ which contains $\alpha \cdot J$, it follows that $\alpha \cdot J$ is a point
$P_j$ of $\beta$. The case in which $\alpha \cdot J = P_n$ has been disposed of in the paragraph
preceding Case 1. Hence suppose $j < n$. Let $A_i$ be the point of $\gamma$ with lowest
subscript which is contiguous to $P_j$, and let $A_{i'}$ be the point of $\gamma$ with highest
subscript which is contiguous to $P_j$. By means of an argument like that used
in Case 1, it may be shown that no point of $\alpha \cdot A_1A_k$ precedes $A_i$ or follows $A_{i'}$
in the order from $A_1$ to $A_k$. If $i = i'$, then $\alpha \cdot A_1A_k = A_i$. Thus $(P_jP_{j+1} + A_i)$ or
$(P_jP_{j+1} + A_iA_{i+1})$, according as $P_j$ and $P_{j+1}$ are or are not both contiguous
to $A_i$, is a triune or simple closed curve $J^*$ which contains $\alpha$ and whose
interior is $I_j$. If $i \ne i'$, then the arc $A_iA_{i'}$ of $A_1A_k$ contains $\alpha \cdot A_1A_k$.
Furthermore, since $\alpha \cdot A_1A_k$ consists either of a single point or of two contiguous
points, it follows that $\alpha \cdot A_1A_k$ is a subset of one of the arcs of the set
ins, AigtA ize, + , Suppose the arc A contains a: 
Then since A,, <i’), is contiguous to P;, it follows that (P;+A 
is a simple closed curve or triune J* which contains a and whose interior 
is Z;,. Thus for any case it has been shown that a is a subset of a simple 
closed curve or triune J* satisfying the condition that either there exists an 
integer 7 such that J; is the interior of J*, or there exists a pair of integers 
(j, 4) such that J;, is the interior of J*. 

Let E* denote the exterior of J*. Now E* contains M,; hence E*=M, 
+(E*—M;,), where both M, and E*— M, are non-vacuous point sets neither 
of which contains a point of a. Thus E* is the sum of two mutually separated 
sets, contrary to Axiom 3; and the theorem is established. 


THEOREM 8. Let the following changes, and none other, be made in the
hypotheses of Theorem 7: A_1, A_2, ···, A_k, (k≥1), are points of an arc A_1T (in
the order indicated if k>1) where A_1T−T is a subset of I and T is a point of
the segment P_nP_1, (n≥2). The set γ=A_1+A_2+ ··· +A_k is the set of all points
of A_1T−T which are contiguous to points of J; β=P_1+P_2+ ··· +P_n is the
set of all points of J which are contiguous to points of γ. The point P_n is no longer
restricted to be contiguous to A_k alone of the set γ; I_n denotes the interior of
P_nT+A_kT; and I_0 denotes the interior of A_1T+TP_1. Then


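          I = (A_1T − T) + \sum_{j=0}^{n} I_j + \sum_{j=1}^{n} \sum_{t=1}^{k_j} I_{jt}.

If β=P_1+P_2+P_3+ ··· and γ=A_1+A_2+A_3+ ···, where β and γ are infinite
sets each having T as its only limit point, then

          I = (A_1T − T) + \sum_{j=0}^{\infty} I_j + \sum_{j=1}^{\infty} \sum_{t=1}^{k_j} I_{jt}.

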
THEOREM 9. Let the following changes, and none other, be made in the
hypotheses of Theorem 7: V, A_1, A_2, ···, A_k, T, (k≥1), are points of the arc
VA_1T in the order indicated, where the segment VA_1T is a subset of I, and
where V and T are points of J in the order V, P_1, P_2, ···, P_n, T, (n≥1). The
set γ=A_1+A_2+ ··· +A_k is the set of all points of the segment VA_1T which
are contiguous to points of J, and β=P_1+P_2+ ··· +P_n is the set of all points
of J which are contiguous to points of γ. The points P_1 and P_n are no longer
restricted as to the number of points of γ to which they are contiguous. If I_0, I_n,
and I* denote the interiors of (VP_1+VA_1), (P_nT+A_kT), and (TV+VA_1T),
respectively, where TV is that arc of J not containing P_1, then


          I = segment VA_1T + I* + \sum_{j=0}^{n} I_j + \sum_{j=1}^{n} \sum_{t=1}^{k_j} I_{jt}.

If β=P_1+P_2+P_3+ ··· and γ=A_1+A_2+A_3+ ··· are infinite sets each having
T as its only limit point, then

          I = segment VA_1T + I* + \sum_{j=0}^{\infty} I_j + \sum_{j=1}^{\infty} \sum_{t=1}^{k_j} I_{jt}.

A similar formula holds for the case in which each of the points V and T
is approached sequentially by a sequence from β and by a sequence from γ.


THEOREM 10. Let the following changes be made in the hypotheses of
Theorem 7: P_1 is contiguous to both A_1 and A_k but to no other point of γ. Either each
of the points A_1 and A_k is contiguous to some point of β−P_1 or else k>2. Then
either it is true that if P_j and P_{j’} are contiguous to A_i and A_{i’}, respectively, and
if j<j’, then i≤i’, or it is true that if P_j and P_{j’} are contiguous to A_i and A_{i’},
respectively, and if j<j’, then i≥i’. If the former condition holds, if I* denotes
the interior of A_1A_k+P_1, and if I_n denotes the point set obtained by substituting n
for j in the definition of I_j given in the statement of Theorem 7, then


          I = A_1A_k + I* + \sum_{j=1}^{n} I_j + \sum_{j=2}^{n} \sum_{t=1}^{k_j} I_{jt}.


The theorem is illustrated by Fig. 2. 


FIG. 2


If in Theorem 10, β=P_1+P_2, where P_1 and P_2 are non-contiguous, then
J is the sum of two arcs, P_1XP_2 and P_1YP_2, and is such that the corresponding
segments are mutually separated. Suppose k≥3. (1) If at least one
of the points A_1 and A_k, say A_1, is contiguous to P_2, then by Theorem 3,
I−A_1=D_1+D_2, where D_1 and D_2 are the interiors of the simple closed curves
P_1XP_2+A_1 and P_1YP_2+A_1, respectively. Let I_1 denote that domain above
which does not contain A_k. Suppose I_1=D_1. Let I_2 denote the interior of
P_1YP_2+A_k or of P_1YP_2+A_{k−1}A_k, according as A_k is or is not contiguous
to P_2. (2) If neither A_1 nor A_k is contiguous to P_2, then I−A_1A_2=D*+D**,
where D* and D** are the interiors of P_1XP_2+A_1A_2 and P_1YP_2+A_1A_2,
respectively. Let I_1 denote the domain above which does not contain A_k.
Suppose I_1=D*. Let I_2 denote the interior of P_1YP_2+A_{k−1}A_k. Then in either of
these two cases we have, in accordance with the above notation, the following
theorem:


This theorem is illustrated by Fig. 3. 


FIG. 3


THEOREM 12. Let J and C denote two point sets, each of which is either a 
simple closed curve or a triune. Let I and D denote the interiors of J and C, re- 
spectively, and suppose P is a point of I-D. Then there exists a simple closed 
curve or triune Q such that (1) Q is a subset of J+C, and (2) the interior of Q
contains P and is a subset of I- D. 


Indication of proof. If at least one of the two sets J and C is a triune, one 
of the sets is a subset of the other plus its interior and hence will have the 
properties required of Q. 

A similar situation exists if both J and C are simple closed curves and one 
of them is a subset of the other plus its interior. 

If neither of the simple closed curves J and C is a subset of the other plus
its interior, let A be a point of J·D, and let PA be an arc from P to A lying
wholly in D. Let O denote the first point, in the order P to A, that PA has
in common with J. I wish to show the existence of a simple closed curve or
triune Q’ such that (1) Q’ is a subset of C+J·D, (2) Q’·C is connected,
(3) Q’·J·D is connected and contains O, and (4) the interior of Q’ contains
(PO−O). Obviously, the existence of Q’ will be established if it can be shown
that there exists a point set K (which shall contain Q’·J·D) which is a
connected subset of J containing O and which satisfies with respect to C those
conditions satisfied by any one of the point sets P, A_1A_k, segment AXB,
(A_1T−T), segment VA_1T, A_1A_k, or A_1A_k with respect to the corresponding
given simple closed curve of Theorems 6, 7, 3, 8, 9, 10, and 11, respectively.
There are several cases.

Case 1. Suppose O is not contiguous to any point of C. Let B denote a
point of J·E. The set J is the sum of two arcs from O to B, say OXB and
OYB. Let M denote the closed point set consisting of C together with all
points Z of D such that Z is contiguous to at least one point of C. Let A_1 and
A_2 denote the first points that OXB and OYB have in common with M, in the
order O to B. If both A_1 and A_2 are points of C, the hypothesis of Theorem 3
is satisfied, and the segment A_1OA_2 is the desired point set K.

Case 2. Suppose O is not contiguous to any point of C and that one of
the points A_1 and A_2 described in Case 1, say A_1, is a point of D, while the
other, A_2, is a point of C. If A_1 is contiguous to two or more points of C, then
the hypotheses of Theorem 8 are satisfied, and (A_1OA_2−A_2) is the desired point
set K. If A_1 is contiguous to only one point F of C, and F is not contiguous to
A_2, then the hypotheses of Theorem 3 are satisfied, and (A_1OA_2−A_2) is the
desired point set K. If F is contiguous to A_2, let A_1’ denote the first point of
OXB, in the order O to B, which is either a point of D that is contiguous to a
point of C−F or else a point of C. Now if A_1’ is a point of D, the hypothesis
of Theorem 8 is satisfied, and (A_1’OA_2−A_2) is the desired point set K. If A_1’
is a point of C, the hypothesis of Theorem 9 is satisfied, and the segment
A_1’OA_2 is the desired point set K.

The remaining cases may be treated with similar methods. 

If Q’ is a triune, Q’ has the properties required of Q, since the interior 
of Q’ cannot contain any points of J. (No simple closed curve may contain a 
point of each complementary domain of a triune.) 

If Q’ is a simple closed curve whose interior contains no point of J, then 
Q’=Q. If the interior of Q’ contains any points of J, let M denote the set 
of all such points of J. Now, M+Q’ contains a simple closed curve Q having
the required properties. The proof of this last statement is little different from 
the proof for spaces in which there do not exist contiguous points. 

As an immediate result of Theorem 12 we have the following theorem: 

THEOREM 13. Let J_1, J_2, ···, J_n denote n point sets each of which is either
a simple closed curve or a triune; and suppose that I_1, I_2, ···, I_n, the interiors
of J_1, J_2, ···, J_n, respectively, have a point O in common. Then there exists a
simple closed curve or triune J which is a subset of J_1+J_2+ ··· +J_n and is
such that I, the interior of J, is a subset of I_1·I_2· ··· ·I_n and contains O.


THEOREM 14. Suppose each of the sets J, and Jz is either a simple closed 
curve or a triune. Let I, and Iz denote the interiors of J, and Jo, respectively, and 
suppose I,+I, is connected. Then there exists a simple closed curve or triune J 
which is a subset of Ji +-J2 and whose interior is a subset of I,- Is. 


Indication of proof. Assume the theorem false. Then by Theorem 12
I_1·I_2=0. Hence there exist points P_1 and P_2 of I_1 and I_2, respectively, such
that P_1 is contiguous to P_2. Thus P_1 is a point of J_2 and P_2 is a point of J_1.
There exists a point F of J_1·J_2 and an arc P_1F of J_2 such that P_1F−F is a
subset of I_1. Let P_2F denote an arc of J_1. The point set P_1F+P_2F contains a
simple closed curve or triune J’ containing P_1 and P_2, whose interior I’ is a
subset of I_1. The set J’+I’·J_2 contains a simple closed curve or triune J
satisfying the required conditions. Thus the assumption is false, and the
theorem is established.

With the help of Theorems 12 and 14, the following theorem may be es- 
tablished: 


THEOREM 15. Let J_1 and J_2 denote two point sets, each of which is either a
simple closed curve or a triune. Let I_1 and I_2 denote the interiors of J_1 and J_2,
respectively. Suppose I_1+I_2 is connected. Then there exists a simple closed curve
or triune J which is a subset of J_1+J_2, and whose interior contains I_1+I_2.


THEOREM 16. Let J_1, J_2, ···, J_n denote n point sets each of which is either a
simple closed curve or a triune. Let I_1, I_2, ···, I_n denote the interiors of
J_1, J_2, ···, J_n, respectively. Suppose I_1+I_2+ ··· +I_n is connected. Then
there exists a point set J which is either a simple closed curve or a triune, which
is a subset of J_1+J_2+ ··· +J_n, and which is such that I, the interior of J,
contains I_1+I_2+ ··· +I_n.


THEOREM 17. If J is a simple closed curve or triune, then I, the interior of J,
contains at least one point which is not contiguous to any point of J.


Proof. Suppose each point of I is contiguous to some point of J. Let P
be any point of I. I wish to show the existence of two distinct points of I,
each contiguous to P. Since I is connected and contains infinitely many
points, P is either contiguous to a point of I or is a limit point of I. By
Axiom C, J contains all limit points of I; hence P must be contiguous to a
point Q of I. In case there exists a point P_1 of J which is contiguous to both
P and Q, then I_1, the interior of the triune PQP_1, is a connected subset of I.
Since J contains all the limit points of I_1, it follows that P is contiguous to a
point T of I_1. Thus for this case there exist two distinct points (Q and T) of
I, each contiguous to P. In case no point of J is contiguous to both P and Q,
let P_1 and Q_1 denote points of J which are contiguous to P and Q, respectively,
and let P_1Q_1 denote an arc of J. The point set P_1Q_1+P+Q contains a simple
closed curve J_2, which contains P and Q. Let I_2 denote the interior of J_2. The
argument used above may be used here to show that P is contiguous to a
point T of I_2.

Let QT denote an arc lying in I. The point set QT+P contains a simple
closed curve or triune J’ whose interior I’ contains infinitely many points,
no one contiguous to any point of J. Thus we reach a contradiction, and the
theorem is established.

Example 1. In the euclidean plane let C_1, C_2, and C_3 denote three circles,
each tangent externally to each of the others. Denote their centers by P_1, P_2,
and P_3, respectively, and their radical center by O. Let A_1=C_1·C_2, A_2=C_2·C_3,
and A_3=C_3·C_1. Let B_1, B_2, and B_3 denote points on the rays OA_1, OA_2, and
OA_3, respectively, such that d(O, B_1)=d(O, B_2)=d(O, B_3), and d(O, B_1)
>d(O, A_1). Let B_3B_1, B_1B_2, and B_2B_3 be circular arcs having P_1, P_2, and P_3,
respectively, as centers and lying on the non-O sides of the lines B_3B_1, B_1B_2,
and B_2B_3, respectively. Let β_{11}=B_3B_1+OB_3+OB_1, β_{21}=B_1B_2+OB_1+OB_2,
β_{31}=B_2B_3+OB_2+OB_3. For each point P of β_{i1} and each j, (i=1, 2, 3;
j=1, 2, 3, ···), let Q_{Pij} denote the point of the interval P_iP such that
d(P_i, Q_{Pij})=d(P_i, A_i)+[d(P_i, P)−d(P_i, A_i)]/j, and let β_{ij} denote the set
of all points Q_{Pij}. Let T_i, (i=1, 2, 3), denote C_i plus its interior. Let S denote
the following collection: (1) T_i, (i=1, 2, 3), is an element of S, and (2) each
point of the plane not in T_1+T_2+T_3 is an element of S. For each positive
integer n, let G_n denote the following subsets of S: (1) the interior of each
circle of radius 1/n or less which neither encloses nor contains a point of
T_1+T_2+T_3 is an element of G_n; (2) for each pair (i, j), (i=1, 2, 3;
j=n, n+1, ···), the set consisting of T_i and all elements of S enclosed by
β_{ij} is an element of G_n. If each element of S is called a “point,” each element
of G_1 is called a “region,” and each of the “points” T_1, T_2, and T_3 is
“contiguous to” each of the others, then each of the axioms of this paper is
non-vacuously satisfied.
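
The construction of the points Q_{Pij} in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function name q_point, the unit circle, and the sample point P are hypothetical choices made only for illustration. It places Q on the interval from the center P_i to P at the distance d(P_i, A_i)+[d(P_i, P)−d(P_i, A_i)]/j, where d(P_i, A_i) is the radius of C_i because A_i lies on C_i.

```python
import math

def q_point(center, radius, p, j):
    """Return the point Q on the interval from `center` to `p` with
    d(center, Q) = radius + (d(center, p) - radius) / j."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    dist = math.hypot(dx, dy)
    target = radius + (dist - radius) / j
    scale = target / dist
    return (center[0] + scale * dx, center[1] + scale * dy)

# Hypothetical data: a circle C_i with center (0, 0) and radius 1,
# and a point P of the outer curve at distance 3 from the center.
center, radius, p = (0.0, 0.0), 1.0, (3.0, 0.0)

# Distance from the center to Q for several values of j.
distances = [math.hypot(*q_point(center, radius, p, j)) for j in (1, 2, 4, 100)]
```

For j=1 the point Q coincides with P, and as j increases Q moves along the interval toward the circle C_i without reaching it, which is why the sets enclosed by the curves β_{ij} shrink down on T_i.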

Example 2. In a euclidean space let s_1, s_2, s_3, s_4, and s_5 denote the spheres
whose equations are (x−1)²+y²+z²=3/2², (x+1/2)²+(y−3^{1/2}/2)²+z²=3/2²,
(x+1/2)²+(y+3^{1/2}/2)²+z²=3/2², x²+y²+(z−3^{1/2}/12)²=3/12²,
x²+y²+(z+3^{1/2}/12)²=3/12², respectively. Each of these spheres is tangent
externally to each of the others. For each i, (i=1, ···, 5), let T_i denote s_i
plus its interior. Let S denote the collection defined as follows: (1) For each i,
(i=1, ···, 5), T_i is an element of S. (2) Let K denote the (x, y)-plane. Each
point of the unbounded component of K−K·(T_1+T_2+T_3) is an element of S.
(3) For each set of three distinct positive integers i, j, and k, such that each
integer is less than 6 and at least one integer is either 4 or 5, let M_{ijk} denote
the plane which contains the centers of the spheres s_i, s_j, and s_k. Each point
of the bounded component of M_{ijk}−M_{ijk}·(T_i+T_j+T_k) is an element of S.
For each positive integer n, let G_n denote the subsets of S defined as follows:
(1) The interior of each circle of radius 1/n or less which is a subset of S·M_{ijk}
or of S·K is an element of G_n. (2) In each of the sets S·M_{ijk}, construct three
sequences of segments like the three sequences of segments constructed within
the triune of Example 1, and in S·K construct three sequences of segments
like those constructed in the exterior of the triune of Example 1. For each
pair (i, j), (i=1, ···, 5; j=n, n+1, ···), the subset of S consisting of T_i
together with all elements of S−(T_1, T_2, ···, T_5) each of which is enclosed
by T_i plus the jth segment of one of the sequences constructed is an element
of G_n. If each element of S is called a “point,” each element of G_1 is called a
“region,” and each of the “points” T_1, ···, T_5 is “contiguous to” each of the
others, then all the axioms of this paper are non-vacuously satisfied.
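
The mutual tangency claimed in Example 2 can be checked numerically. The sketch below is illustrative only and rests on an assumption: the centers (1, 0, 0), (−1/2, ±3^{1/2}/2, 0) with radius 3^{1/2}/2 and (0, 0, ±3^{1/2}/12) with radius 3^{1/2}/12 are a symmetric reading of the partly illegible equations, not values confirmed by the source. Two spheres are tangent externally exactly when the distance between their centers equals the sum of their radii.

```python
import math
from itertools import combinations

R3 = math.sqrt(3.0)

# (center, radius) for s_1, ..., s_5.  The centers of s_2, s_3, and s_4 are
# assumed by symmetry; only s_1 and s_5 are fully legible in the source.
spheres = [
    ((1.0, 0.0, 0.0), R3 / 2),
    ((-0.5, R3 / 2, 0.0), R3 / 2),
    ((-0.5, -R3 / 2, 0.0), R3 / 2),
    ((0.0, 0.0, R3 / 12), R3 / 12),
    ((0.0, 0.0, -R3 / 12), R3 / 12),
]

def externally_tangent(a, b, tol=1e-12):
    """True when the center distance equals the sum of the radii."""
    (ca, ra), (cb, rb) = a, b
    return math.isclose(math.dist(ca, cb), ra + rb, abs_tol=tol)

# Check all ten pairs of distinct spheres.
pairs = list(combinations(spheres, 2))
all_tangent = all(externally_tangent(a, b) for a, b in pairs)
```

Under the assumed centers, the three large spheres have centers at the vertices of an equilateral triangle of side 3^{1/2} in the (x, y)-plane, and the two small spheres touch each other at the origin.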

Example 2 shows that Theorem 6 fails to hold if J denotes a triune in- 
stead of a simple closed curve. 


THEOREM 18. If J is a simple closed curve, if I is the interior of J, and if H 
and K are mutually exclusive compact continua lying in J +I, then no two points 
of H separate two points of K on J. 


Proof. Suppose on the contrary that the points A and B of H separate
the points C and D of K on J. It follows from Theorem 24 of S.C.P. that K
contains an irreducible continuum T from K·ACB to K·ADB. By Theorem
28 of S.C.P., T−T·J is a connected set having boundary points C_1 and D_1
in segments ACB and ADB, respectively. It may be readily shown that the
component of S−(J+H) which contains T−T·J contains a segment C_2YD_2,
where C_2 and D_2 are points of segments ACB and ADB, respectively, and
neither point belongs to H. Thus the arc C_2YD_2 contains no point of H and
lies, except for its end points, in I.

The above argument may be repeated to show that there exists an arc
A_2XB_2 which contains no point of C_2YD_2, where the segment A_2XB_2 lies in I,
and A_2 and B_2 separate C_2 and D_2 on J. But this contradicts Theorem 4.


THEOREM 19. The interior of a simple closed curve or triune is not a subset 
of any simple closed curve or triune. 

Proof. Let I denote the interior of a simple closed curve or triune J. By
Theorem 5, I is not a subset of a triune. Suppose I is a subset of a simple
closed curve C. By Theorem 17, there exists a point P of I which is not
contiguous to any point of J. Now P is either a limit point of S−C or is
contiguous to a point of S−C. If P is a limit point of S−C, let R be a connected
domain containing P, but containing no point of J, and let Q be a point of
R·(S−C). If P is not a limit point of S−C, let Q be a point of S−C which is
contiguous to P. In either case, Q belongs to I·(S−C), contrary to the
supposition that I is a subset of C.


THEOREM 20. If neither of the contiguous points X and Y separates the 
point A from the point B, then their sum does not separate A from B. 


Proof. Suppose S−(X+Y)=S_A+S_B, where S_A and S_B are mutually
separated sets containing A and B, respectively. There exists an arc a from A
to B which does not contain Y. The arc a contains X; hence a contains an
arc AX. Similarly, there exists an arc AY. The point set AX+AY−(X+Y)
is connected and hence lies in S_A. The set AX+AY contains a simple closed
curve or triune J which contains X and Y and is a subset of S_A+X+Y. Let
Q be any point of J−(X+Y), and let I and E denote the two complementary
domains of J. Since Q is a point of S_A and is a boundary point of each of the
connected sets I and E, it follows that I and E are subsets of S_A. Hence S is a
subset of S_A+X+Y. Thus we reach a contradiction, and the theorem is
established.


Axiom 4. If P and Q are two distinct non-contiguous points, there exists a 
simple curve or triune which separates P from Q. 


With the help of Axiom 4, the Borel-Lebesgue theorem (Theorem 5 of 
S.C.P.), and Theorem 16, the next theorem may be established. 


THEOREM 21. If H and K are two mutually separated compact continua, 
there exists a simple closed curve or triune which separates H from K. 


THEOREM 22. If J is a simple closed curve or triune containing the contigu- 
ous points A and B, I is the interior of J, and P is a point of J+I which is not 
contiguous to A, then there exists a simple closed curve or triune J* satisfying 
the following conditions: (1) It is a subset of J+I. (2) Its intersection with J
is an arc containing A and B. (3) Its exterior contains P. 


Proof. Suppose the theorem is false. It may be readily shown that there
exists a simple closed curve or triune which satisfies conditions (1) and (2).
Furthermore it may be readily shown that if P is any point of J not
contiguous to A, or if P is a point of I not contiguous to A such that there is a simple
closed curve or triune J’ containing P and satisfying conditions (1) and (2),
then there exists a simple closed curve or triune satisfying conditions (1), (2),
and (3). Hence our supposition implies that P is a point of I, and every simple
closed curve or triune, satisfying conditions (1) and (2), encloses P. Let PX
denote an arc from P to some point X of J−(A+B) such that PX−X is a
subset of I. Let S_1 denote the set consisting of P and all points Y of PX−X
such that every simple closed curve or triune satisfying conditions (1) and (2)
encloses the interval PY of PX, and let S_2 denote PX−S_1. By the
Dedekind-cut proposition (P.S.T., chap. 1, Theorem 64) there exists a point Q which is
either the last point of S_1 or the first point of S_2, in the order P to X.

Suppose Q is the first point of S_2. There exists a simple closed curve or
triune J’ satisfying the following conditions: If Q=X, then J’=J; if Q≠X,
then J’ contains Q, satisfies conditions (1) and (2), and encloses PQ−Q (=S_1).
There exists a third simple closed curve or triune J_1, which is a subset of J’
plus its interior and is such that J’·J_1 is an arc containing A and B but not
containing Q. Thus Q is exterior to J_1; hence PQ−Q either intersects J_1 or
is exterior to J_1. Both these possibilities are ruled out since J_1 satisfies
conditions (1) and (2) and hence encloses PQ−Q. Thus Q is not the first point of S_2.

Suppose Q is the last point of S_1. By means of an argument analogous to
that used in the last paragraph, it may be shown that Q is not contiguous to a
point of S_2. Thus Q is a limit point of S_2. Now Q cannot be contiguous to both
A and B; for if such were the case, the triune ABQ would satisfy conditions
(1) and (2) and hence would enclose PQ=S_1, and, in particular, the point Q.
Let C denote a point of the pair (A, B) which is not contiguous to Q, and
let Cω be an arc from C to ω lying in J plus its exterior and containing no
point of J which is contiguous to Q. By Theorem 21, there exists a simple
closed curve or triune J’ which separates Q from Cω. Since Q is a limit point
of S_2, it follows that I’, the interior of J’, contains a segment QW of S_2. There
exist a point F of the segment QW, and a simple closed curve or triune J_1
which satisfies conditions (1) and (2) and is such that F is in E_1, the exterior
of J_1. Let I_1 denote the interior of J_1. By Theorem 12, there exists a simple
closed curve or triune J_2, which is a subset of J_1+J’ and whose interior I_2
contains Q and is a subset of I_1·I’. There exists a point T of J_2·I’·J_1. It
may be readily shown that the point set J_1−T+(J_2−T) contains a simple
closed curve or triune J* which satisfies conditions (1) and (2). Furthermore,
since I_2+T+E_1 is a connected point set containing Q and ω but no point
of J*, Q is exterior to J*. Hence Q is not a point of S_1.

Thus we have reached a contradiction and the theorem is established. 


THEOREM 23. The interior of every triune is non-compact and so is the in- 
terior of every simple closed curve which contains two contiguous points. 


Proof. Let J denote a simple closed curve or triune containing the
contiguous points A and B, and let I denote the interior of J. Suppose that I is
compact. It follows that J+I is compact, completely separable, and metric.
Let D_1, D_2, D_3, ··· be a sequence of domains which properly covers J+I
and with respect to which J+I is completely separable. Let n_1 denote the
smallest integer (which exists in view of Theorems 17 and 22) such that
there exists a simple closed curve or triune J_1 which is a subset of J+I,
and such that J·J_1 is an arc containing A and B, and D_{n_1} is exterior to J_1.
Let n_1, n_2, n_3, ··· be an increasing sequence of positive integers satisfying
the following conditions: For each integer k>1, n_k is the smallest integer
greater than n_{k−1} such that there exists a simple closed curve or triune J_k
which is a subset of J_{k−1} plus its interior, and such that J_{k−1}·J_k is an arc
containing A and B, and D_{n_k} is exterior to J_k. For each integer r, let α_r denote
the point J_r−(A+B) in case J_r is a triune. Otherwise, let α_r denote an arc
P_rQ_r of J_r−(A+B), where Q_r is either contiguous to B or else d(Q_r, B)<1/r,
and where P_r is either contiguous to A or else d(P_r, A)<1/r. For each integer
n let E_n denote the exterior of J_n. For each integer r and each point P of α_r
there exists an integer m_P such that E_{m_P} contains P. By the Borel-Lebesgue
theorem, there exists a finite collection of these domains covering α_r. The
domain of this finite collection with greatest subscript, E_m, contains all other
domains of the finite collection; therefore E_m contains α_r. Hence α_r·J_m=0, and
consequently α_r·α_m=0. Thus there exists an increasing sequence of integers
r_1, r_2, r_3, ··· such that for each integer k>0, α_{r_k}·α_{r_{k+1}}=0; hence
α_{r_1}, α_{r_2}, α_{r_3}, ··· is a sequence of mutually exclusive continua. There exists
a subsequence of this sequence which converges to a sequential limiting set L
containing A and B. Hence by Theorem 33 of S.C.P., L is a perfect continuum
and is therefore uncountable. For each integer n, L is a subset of J_n plus its
interior. Hence if L contains a point X other than A or B, then X is contiguous
to A or B. But the set of all such points X is at most countable by Theorem 14
of S.C.P. Hence L is countably infinite or finite. We have thus reached a
contradiction, and the theorem is established.


THEOREM 24.* If H and K are two mutually separated, closed and compact 
point sets containing the points A and B, respectively, then there exists a simple 
closed curve or triune J which separates A from B such that J-(H+K)=0. 


Proof. Let h_A and k_B denote the components of H and K containing A
and B, respectively. From Theorem 21 it follows that for each component h
of H there exists a simple closed curve or triune J_{hB} which separates h from


* Cf. L. Zoretti, Sur les fonctions analytiques uniformes, Journal de Mathématiques Pures et 
Appliquées, vol. 1 (1905), pp. 9-11. 
k_B. Let D_{hB} denote that complementary domain of J_{hB} which contains h.
The collection of all such domains D_{hB} covers H; hence there exists a finite
collection T of such domains covering H. Let T* denote the point set which
is the sum of all the elements of T. By Theorem 16 there exists a simple closed
curve or triune J_{AB}, whose interior with respect to B contains that component
of T* which contains A. Furthermore, (H+k_B)·J_{AB}=0. For each component
k of K let J_{Ak} denote a simple closed curve or triune (which exists in view of
the preceding argument) which separates A from k and is such that
(H+K)·J_{Ak}=0. Let D_{Ak} denote the interior of J_{Ak} with respect to A.
The collection of all such domains D_{Ak} covers K; hence there exists a finite
collection G of such domains covering K. Let G* denote the sum of the
elements of G. By Theorem 16 there exists a simple closed curve or triune J
whose interior with respect to A contains that component of G* which
contains B. Furthermore, (H+K)·J=0.


THEOREM 25.† If H and K are two mutually separated closed and compact
point sets, and if neither H nor K separates the point A from the point B, then
H+K does not separate A from B.


Proof. Suppose H+K separates A from B. 

Case 1. Suppose H is connected. With the help of Theorem 24, it may be
shown that there exists a domain D which contains K but no point of H and
is the sum of a finite number of components each of which is bounded by a
simple closed curve or triune containing no point of H+K. Let AXB denote
an arc from A to B containing no point of K, and AYB an arc containing no
point of H. Let J_1, J_2, J_3, ···, J_n denote the boundaries of those components
of D which have points in common with AYB. There are four possibilities:
(1) A and B are both in S−D, (2) A and B are in the same component of D,
(3) A and B are in different components of D, (4) A or B is in S−D while
the other is in D. In each case it may be readily shown that (AYB−AYB·D)
+J_1+J_2+ ··· +J_n+AXB·D contains an arc from A to B.

Case 2. Suppose H has only a finite number of components, h_1, h_2, ···, h_n.
By Case 1, h_1+K does not separate A from B. The sets h_2 and h_1+K are
mutually separated; hence by Case 1, h_2+(h_1+K) does not separate A from B.
Continuing this process we obtain the fact that h_n+(h_{n−1}+ ··· +h_1+K) does
not separate A from B.

Case 3. Suppose H has infinitely many components. Let AXB denote an
arc containing no point of K. By Axiom C, at most a finite number of
components of K contain points which are contiguous to points of AXB. Let


† Cf. P. Alexandroff, Sur les multiplicités cantoriennes et le théorème de Phragmén-Brouwer
généralisé, Comptes Rendus de l’Académie des Sciences, Paris, vol. 183, pp. 722-724.
k_1, k_2, ···, k_n be the set of all such components of K. By Case 2 there exists
an arc AYB which contains no point of H+k_1+ ··· +k_n. If AYB·K=0,
the theorem is established. Suppose AYB·K≠0. For each component k of K
which intersects AYB, the set H+AXB+K is the sum of two mutually
separated sets, one containing A and the other containing k, by Theorem 27 of
S.C.P. By Theorem 24 there exists a simple closed curve or triune J_k which
separates A from k and is such that J_k·(H+AXB+K)=0. Let I_k denote
the interior of J_k with respect to A. The collection of all such domains I_k
covers the closed point set AYB·K; hence there exists a finite subcollection
T doing so. Let T* denote the sum of the elements of T. By Theorem 16,
for each component t of T* there exists a simple closed curve or triune C_t
which is a subset of the boundary of t and whose interior with respect to A
contains t. Let C_1, C_2, ···, C_s be the set of all simple closed curves or triunes
thus obtained. The point set (AYB−AYB·T*)+C_1+C_2+ ··· +C_s contains
an arc from A to B which contains no point of H+K.


THEOREM 26. If J is a simple closed curve, I is a complementary domain
of J, H and K are two mutually separated subcontinua of J, α and β are the two
components of J−(H+K), and C is a simple closed curve which separates H
from K, then there exists an arc AXB such that A and B are points of α and β,
respectively, and segment AXB is a subset of I·C.


Proof. Suppose the theorem is false. Let ω be a point of S−(J+I+C),
which exists by Theorem 19. Let D denote the interior of C. One of the sets
H and K, say H, is a subset of D. Obviously, C contains at least one point of
each of the sets α, β, and I. Let P, Q, Z, and W denote points of α·C, β·C,
H, and K, respectively. Let α’ and β’ denote continua which are subsets of α
and β, respectively, and which contain α·C and β·C, respectively. Now,
α’+β’+I·C is the sum of two mutually separated continua M_1 and M_2
containing α’ and β’, respectively. Let E denote the last point of the arc PZQ,
in the order P to Q, which is either a point of M_1 or contiguous to a point of
M_1; and let F denote the first point of PZQ which is either a point of M_2 or
contiguous to a point of M_2. It may be readily seen with the help of Theorem 4
that if E and F are distinct, then E precedes F in the order P to Q. Suppose
first that E and F are distinct non-contiguous points. The interval EF of
PZQ contains at least one point of H. Furthermore, there exists an arc OL
where L is a point of segment EF and OL−L is a subset of I−I·C. By
Theorem 12 there exists a simple closed curve J’ which is a subset of J+C and
contains EF, and whose interior I’ is a subset of I·D and contains OL−L.
Now J’ contains no point of K, for otherwise H+I’+K would be a connected
set containing no point of C. Thus J’ contains an arc E’F’, where E’ and F’
are points of segments LEW and LFW, respectively, and the segment E’F’
is a subset of I·C. Thus segment E’F’ is a subset of M_1 and also a subset
of M_2. From this contradiction, it follows that E and F are identical or
contiguous. Also (E+F)·(M_1+M_2)=0. Similarly, if T denotes the last point
of PWQ which is either a point of M_1 or contiguous to a point of M_1, and if V
denotes the first point of PWQ which is either a point of M_2 or contiguous to
a point of M_2, then T and V are identical or contiguous. Let C’ denote a
simple curve which separates M_1 from M_2. Let P’Q’ denote an arc such that
P’ and Q’ are points of M_1 and M_2, respectively, and segment P’Q’ is a
subset of J+I−(J+I)·(C+E+F+T+V). Let G denote a point of segment
P’Q’ which belongs to C’, and let NY denote an arc which is a subset of C’
and contains G, where N and Y are points of J and segment NY is a subset
of I. Now N and Y do not lie in different components of J−(α’+β’), for
otherwise, H+K+NY would be a connected set containing no point of C.
Suppose both N and Y are points of that component of J−(α’+β’) which
contains H. If either N or Y preceded E in the order PZQ, we would have
two arcs (one a subset of C·M_1+E, the other a subset of C·M_2+ interval
Q’G of arc P’Q’+interval NG of arc NY) satisfying the hypothesis of
Theorem 4 but not the conclusion of this theorem. Similarly, it may be shown that
neither N nor Y follows F. Hence N is identical with Y or contiguous to Y.
Again a contradiction is reached, and the theorem is established.


THEOREM 27. If A, X, B, and Y are points of the simple closed curve J in the
order indicated, if I is the interior of J, and if H and K are mutually separated
closed and compact subsets of $J+I$ such that $H \cdot AXB = 0$ and $K \cdot AYB = 0$, then
there exists an arc AB lying in $J+I-(H+K)$.


Proof. Two cases will be considered. 

Case 1. Suppose there exists no component T of $H+K$ such that
$T+J-(A+B)$ is connected. Let $N_1$ and $N_2$ denote continua which are
subsets of segments AYB and AXB, respectively, and which contain $H \cdot AYB$
and $K \cdot AXB$, respectively; and let $\alpha$ and $\beta$ denote the components of
$J-(N_1+N_2)$ which contain A and B, respectively. By Theorem 27 of
S.C.P., $N_1+N_2+H+K$ is the sum of two mutually separated closed sets
$M_1$ and $M_2$ containing $N_1$ and $N_2$, respectively. Hence by Theorem 24 there
exists a simple closed curve C which separates $N_1$ from $N_2$ and contains no
point of $M_1+M_2$. Thus C contains no point of $H+K$. By Theorem 26 there
exists an arc $A'X'B'$, where $A'$ and $B'$ are points of $\alpha$ and $\beta$, respectively;
and segment $A'X'B'$ lies in $I \cdot C$. Now $A'X'B'+\alpha+\beta$ contains an arc AB
lying in $J+I-(H+K)$.

Case 2. Suppose that for at least one component T of $H+K$, $T+J-(A+B)$
is connected. With the help of Axiom C, it may be shown that there is only a
finite number of such components. Either H or K, say H, contains at least one
such component. There exists a point $Y_1$ which is the first point of segment
AYB, in order A to B, which is either a point of, or contiguous to a point of,
such a component of H.

First suppose there exists at least one point Q of $AY_1 - A$ having the
property that there is a component $T_Q$ of K such that $Q+T_Q+$segment AXB is
connected. Let $Q_1, Q_2, \dots, Q_n$ denote all such points in the order A to $Y_1$.
Obviously $Q_i$, $(i = 1, 2, \dots, n)$, is not a point of $H+K$. If the segment $AQ_1$
of $AY_1$ contains no point of H, let $\alpha_1$ denote the interval $AQ_1$. If segment
$AQ_1$ contains a point of H, it may be shown with the help of Theorem 18 and
Case 1 of this theorem that there exists an arc $\alpha_1$ from A to $Q_1$, lying in
$J+I-(H+K)$. Similarly, there exists an arc $\alpha_i$, $(i = 2, 3, \dots, n)$,
from $Q_{i-1}$ to $Q_i$, lying in $J+I-(H+K)$. The set $\alpha_1+\alpha_2+\cdots+\alpha_n$ contains
an arc $\alpha$ from A to $Q_n$. Now let $X_2$ denote the last point of segment AXB, in
order A to B, for which there exists a component $T_2$ of H, such that
$Y_1+T_2+X_2$ is connected. If $Q_n = Y_1$, and if there exists a point P of AXB
between $X_2$ and B for which there is a component $T'$ of K such that
$Y_1+T'+P$ is connected, then, denoting $Q_n$ by $Q'$, we have an arc $AQ' = \alpha$
having the following properties: (1) The arc $AQ'$ lies in $J+I-(H+K)$, (2) $Q'$
is a point of segment AYB and does not precede $Y_1$, (3) there is at least one
component T of H for which [$T+$segment AXB$+$interval $AQ'$ (of AYB) $-A$]
is connected, but $T+Q'B-Q'$ is not connected, and (4) there exists no
component W of $H+K$ such that $W+AYB-Q'$ is connected. If $Q_n = Y_1$, and if
there exists no such point P, let $X_1$ denote the first point of AXB, such that
$Y_1+T_2+X_1$ is connected; or if $Q_n \neq Y_1$, let $X_1$ denote the first point of AXB
for which there exists a component T of H such that $Y_1+T+X_1$ is connected.
In either case, $X_1$ is a point of segment AXB, and with the help of Theorem 18,
Case 1, it may be shown that there exists an arc $\beta$ from $Q_n$ to $X_1$ lying in
$J+I-(H+K)$. If $X_1 \neq X_2$, then by means of an argument like that used in
obtaining $\alpha$, it may be shown that there exists an arc $\gamma$, from $X_1$ to $X_2$, lying
in $J+I-(H+K)$. If $X_1 = X_2$, let $\gamma$ denote $X_1$. If there exists no point $X'$ of
AXB between $X_2$ and B for which there is a component T of K such that
segment AYB$+T+X'$ is connected, then the argument for obtaining $\alpha$ may
be used here to obtain an arc $\delta$, from $X_2$ to B, lying in $J+I-(H+K)$. In this
case, $\alpha+\beta+\gamma+\delta$ contains an arc from A to B, and the theorem is established.
If there exists a point $X'$, as described, let $Q'$ denote the first point of
segment AYB for which there exists a component T of K, such that $Q'+T+$segment
$X_2B$ is connected. It follows from Theorem 18 and Case 1 that there exists an
arc $\eta$, from $X_2$ to $Q'$, lying in $J+I-(H+K)$. In this case, $\alpha+\beta+\gamma+\eta$
contains an arc $AQ'$ having the four properties listed above.

Now suppose there is no point Q as described above. Let $X_2$ have the
same meaning as above, and let $\gamma$ denote an arc from A to $X_2$ lying in
$J+I-(H+K)$. Again, if there exists no point $X'$, as described above, let $\delta$
denote an arc from $X_2$ to B lying in $J+I-(H+K)$. In this case $\gamma+\delta$
contains an arc from A to B, and the theorem is established. If there exists a
point $X'$, as described above, let $Q'$ and $\eta$ be defined as before. Now $\gamma+\eta$
contains an arc $AQ'$ with the properties listed above.

Thus, in any case either there exists an arc AB lying in $J+I-(H+K)$,
or else there exists an arc $AQ'$ having the properties listed above. In the latter
case, if there exists no point $Y'$ of AYB, between $Q'$ and B, for which there
exists a component T of H such that $Y'+T+$segment AXB is connected,
then the argument for obtaining $\alpha$ may be used to obtain an arc $Q'B$ lying
in $J+I-(H+K)$. In such a case, $AQ'+Q'B$ contains an arc from A to B,
and the theorem is established. If there exists a point $Y'$, as described above,
let $Y_2$ denote the first such point in the order $Q'$ to B. Now the argument
for obtaining $AQ'$ may be repeated to obtain either an arc $Q'B$ lying in
$J+I-(H+K)$ (in which case $AQ'+Q'B$ contains an arc from A to B, and
the theorem is established), or an arc $Q'Q''$ of such nature that $AQ'+Q'Q''$
contains an arc $AQ''$ having the following properties: (1) $AQ''$ lies in
$J+I-(H+K)$, (2) $Q''$ is a point of segment AYB and does not precede $Y_2$,
(3) there are at least two components of H such that if T denotes either of
them, then [$T+$segment AXB$+$interval $AQ''$ of AYB $-A$] is connected, but
[$T+Q''B-Q''$] is not connected, and (4) there exists no component W of
$H+K$, such that $W+AYB-Q''$ is connected. Since there is only a finite number
of components of H such that if T denotes any one of them, $T+J-(A+B)$
is connected, the above process may be repeated until we have obtained either
an arc AB lying in $J+I-(H+K)$, or an arc $AQ^{(n)}$ lying in $J+I-(H+K)$,
where $Q^{(n)}$ is a point of segment AYB, and there is no point $Y'$ between $Q^{(n)}$
and B for which there is a component T of H such that $Y'+T+$segment
AXB is connected. In the latter case, the argument for obtaining $\alpha$ may be
used to obtain an arc $Q^{(n)}B$ lying in $J+I-(H+K)$. The set $AQ^{(n)}+Q^{(n)}B$
contains an arc AB lying in $J+I-(H+K)$; and the theorem is established.


THEOREM 28.* If the common part of the closed and compact point sets H
and K is a continuum, and if neither H nor K separates the point A from the
point B, and if furthermore, $H - H \cdot K$ and $K - H \cdot K$ are mutually separated
sets, then $H+K$ does not separate A from B.

* Cf. S. Janiszewski, Sur les coupures du plan faites par des continus, Prace Matematyczno-Fizyczne,
vol. 26, 1913. Also Anna M. Mullikin, Certain theorems relating to plane connected point sets,
these Transactions, vol. 24 (1922), pp. 144-162.



Proof. Suppose on the contrary that H+K separates A from B. Let 
M=H+K and T=H.-K. Let S; and Sz denote the components of S—M 
which contain A and B, respectively. There exist arcs AXB and AYB such 
that AXB-H=0 and AYB-K=0. Let X, and Y, denote the first points of 
AXB and AYB, respectively, which belong to the boundary of S:. Thus X; 
and Y, belong to K—T and H—T, respectively, and hence are not contigu- 
ous to each other. Let AX, and AY, denote intervals of AXB and AYB, 
respectively. The set AX,+A contains an arc such that =0 
and A,Y,:K=0. If X; is contiguous to any point of S2, let X_ denote such a 
point, and let X,X2 denote the arc consisting of these two points. Otherwise, 
let R denote a connected domain containing X; but containing no point or 
boundary point (other than X;) of H+A,¥,, let X2 denote a point of R-Ss, 
and let X,X_ denote an arc lying in R. If Y; is contiguous to any point of So, 
let Y, denote such a point, and let Y, V2 denote the arc consisting of these two 
points. Otherwise, let W denote a connected domain containing Y; but con- 
taining no point or boundary point (other than Y;) of K+A1Xi+X1X2, let V2 
denote a point of W-S2, and let Y; Y2 denote an arc lying in W. Let X2Y2 de- 
note either the point X, or an arc lying in S; according as X- is or is not V. It 
may be readily shown that the set X¥,4;V¥;+X,X2+Vi¥2+X2F2 contains a 
simple closed curve J such that (1) J contains A; and a point B, of X2V2, 
and (2) of the two segments of J from A, to B,, one contains no point of 
H and the other contains no point of K. Let J denote that complemen- 
tary domain of J which does not contain T, and let Hi=H-(J+J) and 
K,=K-(J+1). It follows from Theorem 27 that there exists an arc A,B, 
lying in J/+7—(H,+K,). Thus A,B, contains no point of M. But this is 
impossible since A, and B, lie in different complementary domains of M. The 
theorem is thus established. 


THEOREM 29. If no point of the arc XY separates the point A from the 
point B, then XY does not separate A from B. 


Proof. Suppose $S - XY = S_1 + S_2$, where $S_1$ and $S_2$ are mutually separated
sets containing A and B, respectively. Let $S_3$ denote the set consisting
of X together with all points Z, if there are any, such that the interval XZ
does not separate A from B, and let $S_4 = XY - S_3$. Now $S_4$ contains Y, and
clearly every point of $S_3$ precedes every point of $S_4$. Hence there exists either
a last point of $S_3$ or a first point of $S_4$.

Suppose there exist a point O which is the last point of $S_3$ and a point Q
which is the first point of $S_4$. By Theorem 20 the interval OQ does not separate
A from B. Hence by Theorem 28, $S_3 + OQ$ does not separate A from B. Thus
XQ does not separate A from B, contrary to the fact that Q is a point of $S_4$.

Suppose that there exists a point O which is the last point of $S_3$. Let
$O_1, O_2, O_3, \dots$ be a sequence of points of $S_4$ converging to O such that
each precedes the next in the order Y to X. By Theorem 28, the interval
$OO_n$ separates A from B. Thus, by Theorem 5, Chapter 2, of P.S.T., the point O
separates A from B.

Suppose there exists a point O which is the first point of $S_4$. Let
$O_1, O_2, O_3, \dots$ be a sequence of points of $S_3$ converging to O such that each
precedes the next in the order X to Y. By Theorem 28, the interval $O_nO$
separates A from B. Hence by Theorem 5, Chapter 2, of P.S.T., the point O
separates A from B. Again a contradiction is reached and the theorem is
established.

The following proposition is false: 


Proposition. If A, B, and C are three distinct points, and if A is not con- 
tiguous to either B or C, then there exists a simple closed curve or triune which 
separates A from B+C. 


In Example 1, let A be any point of the triune $T_1T_2T_3$, and let B and C
be points of different complementary domains of this triune. There does not
exist a simple closed curve or triune which separates A from $B+C$.
ON INTEGRATION IN VECTOR SPACES*


BY 
B. J. PETTIS 


Introduction. Several authors ([3]-[10], inclusive; [15])† have already
given generalized Lebesgue integrals for functions $x(s)$ whose values lie in a
Banach space (B-space) $\mathfrak{X}$.‡ In the following pages another definition,§ based
on the linear functionals over $\mathfrak{X}$ and on the ordinary Lebesgue integral,‖ will
be given and its properties and relationships to other integrals discussed.

We consider functions $x(s)$ defined to a B-space $\mathfrak{X} = [x]$ from an abstract
space $S = [s]$ possessing both a $\sigma$-field $\Sigma$ of “measurable” sets having S as an
element and a non-negative bounded c.a. (completely additive) “measure”
function $\alpha(E)$ defined over $\Sigma$. Notational conventions are as follows: $\bar{\mathfrak{X}}$
denotes the B-space conjugate to $\mathfrak{X}$ ([1], p. 188), $f = f(x)$, $g = g(x), \dots$
denote elements of $\bar{\mathfrak{X}}$, and $F = F(f)$, $G = G(f), \dots$ elements of $\bar{\bar{\mathfrak{X}}}$, the
conjugate space of $\bar{\mathfrak{X}}$; and for real-valued functions Greek letters will be used.
When the abstract functions $x(s), y(s), \dots$, or the real functions $\phi(s), \psi(s), \dots$
are considered as elements of a functional space, they will sometimes be
written $x(\cdot)$, $y(\cdot)$ or $\phi(\cdot)$, $\psi(\cdot)$. The “zero” element of $\mathfrak{X}$ will be denoted by $\theta$.

The contents of the paper may be outlined as follows. The first section
compares measurability of functions under the strong and weak topologies
of $\mathfrak{X}$. The second defines the ($\mathfrak{X}$) integral and utilizes an essential lemma, first
stated by Orlicz, to prove the complete additivity and absolute continuity
of the integral. The linear operations from $L^p$ to $\mathfrak{X}$ and from $\bar{\mathfrak{X}}$ to $L^{p'}$ defined
by integrable functions are investigated in §3, and these results are expressed
in terms of real-valued kernels in §7. Approximation and convergence theorems
occupy §4 and lead to §5 and the relationships between the ($\mathfrak{X}$) integral
and the integrals given by other definitions. A few remarks on completely
continuous transformations and on differentiation account for §§6 and 8,
respectively, and four examples form §9. In conclusion a few open questions are
cited.

* Presented to the Society, March 27, 1937; received by the editors August 28, 1937.

† Numerals in brackets refer to the list of references at the end of the paper.

‡ See [1]. The terminology used will be that of Banach’s book.

§ Suggested to the author by E. J. McShane. The definition is also due to Dunford, Uniformity
in linear spaces, these Transactions, vol. 44 (1938), pp. 305-356. Also see [10], p. 50, footnote.

The author wishes to take this opportunity to express his gratitude to Professor McShane
for his most kind and helpful criticism during the writing of this paper.

‖ From the definition it will be evident that any integral for real-valued functions can be
generalized in the same fashion.
For reference purposes the following well known theorems concerning
B-spaces will be listed:*


A. Given a linear functional (an additive, continuous, real-valued function)
$g(y)$ defined in a linear subset $\mathfrak{Y}$ of $\mathfrak{X}$, there exists a linear functional $f(x)$ defined
in $\mathfrak{X}$ and such that $f(y) = g(y)$, for y in $\mathfrak{Y}$, and $\|f\| = \|g\|$ where the norm of f is
taken over $\mathfrak{X}$ and that of g over $\mathfrak{Y}$.


B. For every $x_0$ in $\mathfrak{X}$ there is a linear functional $f_0(x)$ defined over $\mathfrak{X}$ and such
that $|f_0(x_0)| = \|x_0\|$, and $\|f_0\| = 1$.

C. Given a sequence $\{x_n\}$ of elements of $\mathfrak{X}$, the closed linear hull (the closure
of the set of all finite linear combinations) of the $x_n$’s is a separable sub-B-space
of $\mathfrak{X}$.

1. Measurable and weakly measurable functions. A function $x(s)$ from S
to $\mathfrak{X}$ will be called a step-function if and only if $x(s)$ is constant on each of a
finite number of disjoint measurable sets whose sum is S. As in Bochner [4],
$x(s)$ is called measurable provided it is a.e. (almost everywhere) the strong
limit of a sequence of step-functions; a measurable function taking only a
countable number of distinct values will be said to be countably-valued. If
there exists a separable sub-B-space $\mathfrak{Y}$ and a set $S'$ such that $\alpha(S') = \alpha(S)$
and s in $S'$ implies $x(s)$ in $\mathfrak{Y}$, then $x(s)$ is separably-valued.

It is evident that a measurable function $x(s)$ is also weakly measurable,
that is, $f(x(s))$ is measurable ([2], p. 251) for every f in $\bar{\mathfrak{X}}$. The relationship
between measurability and weak measurability is given in the following
theorem:

THEOREM 1.1. A necessary and sufficient condition that $x(s)$ be measurable
is that it be weakly measurable and separably-valued.


If $x(s)$ is the limit a.e. of step-functions $x_n(s)$, then almost all of its values
lie in the separable closed linear hull (see C) of the denumerable set of values
of the functions $x_n(s)$; thus $x(s)$ is separably-valued.

Now suppose $x(s)$ is weakly measurable and, with no loss of generality,
that all the values of $x(s)$ lie in a separable subspace $\mathfrak{Y}$.

The first conclusion is that $\|x(s)\|$ is measurable. Let $\{g_i\}$ be weakly dense
([1], p. 123) in the surface $\mathfrak{S}$ of the unit sphere of $\bar{\mathfrak{Y}}$. If $A = S[\|x(s)\| \le a]$, and
$A_g = S[|g(x(s))| \le a]$ for each g in $\mathfrak{S}$, then $A = \prod A_g$, where the product is
taken over $g \in \mathfrak{S}$. For if $\|x(s)\| \le a$, then for any g in $\mathfrak{S}$ we have $\|g\| = 1$ and
hence $|g(x(s))| \le \|x(s)\| \le a$; and for the fixed element $x(s)$ of $\mathfrak{Y}$ there is, by B, a
z in $\mathfrak{S}$ such that $\|x(s)\| = |z(x(s))|$. Thus $A = \prod A_g \subset \prod_i A_{g_i}$, where the first
product is taken over $g \in \mathfrak{S}$. The sequence $\{g_i\}$ being weakly dense, for any g
in $\mathfrak{S}$ there is a subsequence $\{g_i'\}$ such that $g_i'(x) \to g(x)$ for every x in $\mathfrak{Y}$;
thus if s is in $\prod_i A_{g_i}$, then $|g(x(s))| = \lim_i |g_i'(x(s))| \le a$ for each g in $\mathfrak{S}$, so
that $A = \prod A_g = \prod_i A_{g_i}$, where the first product is taken over $g \in \mathfrak{S}$. From A
and the hypotheses on $x(s)$, each set $A_{g_i}$ is measurable, which implies
measurability for $\|x(s)\|$.

* The first two are Theorems 2 and 3, [1], p. 55.

If each $x(s)$ is covered with an open $1/n$-sphere in separable $\mathfrak{Y}$, by
Lindelöf’s theorem a countable set $\{\mathfrak{Y}_{in}\}$, $(i = 1, 2, \dots)$, of these spheres
contains the set of functional values of $x(s)$. If $x_{in}$ is the center of $\mathfrak{Y}_{in}$, then
$x(s) - x_{in}$ is weakly measurable and has all its values in $\mathfrak{Y}$, so that $\|x(s) - x_{in}\|$
is measurable. Hence the set $E_{in}$, composed of all s having $x(s)$ in $\mathfrak{Y}_{in}$, is a
measurable set, and $\sum_i E_{in} = S$. If $x_n(s) = x_{in}$ for s in $E_{in}^* = E_{in} - \sum_{j<i} E_{jn}$, then
$\|x(s) - x_n(s)\| < 1/n$ for all s, since $\sum_i E_{in}^* = S$. Since the sets $E_{in}$ and $E_{in}^*$ are
measurable, $x_n(s)$ is constant on each of a countable number of disjoint
measurable sets whose sum is S; $x_n(s)$ is therefore measurable, hence $x(s)$, the
uniform limit of $\{x_n(s)\}$, is also measurable.
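
The role of the separability hypothesis in Theorem 1.1 can be seen in a standard illustration that is not taken from the text. The space $\ell^2([0,1])$, the unit vectors $e_t$, and the function $x(s) = e_s$ below are assumptions chosen for this sketch, with $S = [0,1]$ and $\alpha$ the Lebesgue measure.

```latex
% Illustrative sketch (not from the paper): a weakly measurable function
% that is not separably-valued, hence not measurable by Theorem 1.1.
% Assumed setting: S = [0,1], \alpha = Lebesgue measure, and
% \mathfrak{X} = \ell^2([0,1]) (real), with orthonormal unit vectors e_t, 0 <= t <= 1.
\[
  x(s) = e_s , \qquad 0 \le s \le 1 .
\]
% Every linear functional f over \ell^2([0,1]) has the form f(x) = (x, y),
% and y has at most countably many non-zero coordinates y_t. Hence
\[
  f(x(s)) = (e_s, y) = y_s = 0 \quad \text{for all but countably many } s ,
\]
% so f(x(s)) vanishes a.e. and is measurable: x(s) is weakly measurable.
% On the other hand,
\[
  \| e_s - e_t \| = \sqrt{2} \qquad (s \ne t) ,
\]
% so a separable subspace \mathfrak{Y} can contain e_s for at most countably
% many s. No set S' with \alpha(S') = \alpha(S) is mapped into \mathfrak{Y},
% and x(s) is not separably-valued.
```

In this sketch the weak measurability comes for free because each functional sees only countably many coordinates, which is exactly what fails to control the strong behavior of $x(s)$.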


COROLLARY 1.11. If $\mathfrak{X}$ is separable, measurability is equivalent to weak
measurability.


COROLLARY 1.12. A function $x(s)$ is measurable if and only if it can be
approximated uniformly, except possibly on a set of measure zero, by
countably-valued functions.


COROLLARY 1.13. If $\{x_n(s)\}$ is a sequence of measurable functions converging
weakly a.e. to $x(s)$, then $x(s)$ is measurable.


For if $\mathfrak{Y}_n$ is a separable sub-B-space containing almost all the values of
$x_n(s)$, let $\mathfrak{Y}$ be the separable closed linear hull of a denumerable set dense in
$\sum_n \mathfrak{Y}_n$. If $x_n(s) \to x(s)$ weakly at the point s, then ([1], p. 134) some sequence
of linear combinations of the elements $\{x_n(s)\}$ converges strongly to $x(s)$ and
$x(s)$ is therefore in $\mathfrak{Y}$. Since this statement holds for almost all s, and since
$x(s)$ is obviously weakly measurable, the above theorem gives the desired
conclusion.

1.14. Suppose that S is euclidean, $R_0$ is a fixed elementary figure in S,
and $\alpha$ is the Lebesgue measure function. If $X(R)$ is defined to $\mathfrak{X}$ from the
elementary figures in $R_0$, then $X(R)$ is weakly differentiable at a point s in $R_0$
if there exists an element $x(s)$ in $\mathfrak{X}$ such that for every f in $\bar{\mathfrak{X}}$

\[
  f(x(s)) = \lim_{\alpha(I) \to 0} f\!\left( \frac{X(I)}{\alpha(I)} \right),
\]

where I is an arbitrary cube containing s.


THEOREM 1.2. If $X(R)$ is weakly differentiable a.e. in $R_0$ to $x(s)$, then $x(s)$
is measurable.


Let $I_0$ be a cube containing $R_0$, and for each n let $\Pi_n$ be a partition
of $I_0$ into non-overlapping non-degenerate subcubes, with $\Pi_{n+1}$ a repartition
of $\Pi_n$ and $\lim_{n \to \infty} (\text{norm } \Pi_n) = 0$. If s in $R_0$ is interior to a cube $I_{in}$ in $\Pi_n$, and
if $R_0 \supset I_{in}$, define

\[
  x_n(s) = \frac{X(I_{in})}{\alpha(I_{in})} ;
\]

otherwise set $x_n(s) = \theta$. Each $x_n(s)$ is measurable, and by hypothesis $x_n(s)$
converges weakly to $x(s)$ for almost all s in $R_0$; from Corollary 1.13 this implies
measurability for $x(s)$.

2. Definition and properties of the integral. We make the following
definition:


DEFINITION 2.1. A function $x(s)$ from S to $\mathfrak{X}$ is ($\mathfrak{X}$) integrable over measurable
E if and only if there exists an $x_E$ in $\mathfrak{X}$ such that

\[
  f(x_E) = \int_E f(x(s))\,d\alpha
\]

for every f in $\bar{\mathfrak{X}}$, wherein the integral on the right is the Lebesgue.* By definition

\[
  \int_E x(s)\,d\alpha = x_E .
\]

A function $x(s)$ is integrable or ($\mathfrak{X}$) integrable if and only if it is ($\mathfrak{X}$) integrable
over E for every measurable E.†
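
As a check on Definition 2.1, the following sketch works out the integral of a step-function. The symbols $x_i$, $E_i$, and $k$ are introduced here only for illustration and do not appear in the original.

```latex
% Illustrative sketch: Definition 2.1 applied to a step-function.
% Assumption: E_1, ..., E_k are disjoint measurable sets whose sum is S,
% and x_1, ..., x_k are fixed elements of \mathfrak{X}.
\[
  x(s) = x_i \quad \text{for } s \in E_i , \qquad i = 1, \dots, k .
\]
% Candidate value of the integral over a measurable set E:
\[
  x_E = \sum_{i=1}^{k} \alpha(E \cdot E_i)\, x_i .
\]
% Verification against the defining identity, for an arbitrary f in \bar{\mathfrak{X}}:
\[
  \int_E f(x(s))\,d\alpha
    = \sum_{i=1}^{k} f(x_i)\,\alpha(E \cdot E_i)
    = f\Bigl( \sum_{i=1}^{k} \alpha(E \cdot E_i)\, x_i \Bigr)
    = f(x_E) ,
\]
% the middle step using only the additivity and homogeneity of f.
% Hence x(s) is integrable and \int_E x(s)\,d\alpha = x_E.
```

Since the elements of $\bar{\mathfrak{X}}$ separate the points of $\mathfrak{X}$ (theorem B), no other element can satisfy the identity, which is the single-valuedness noted in 2.11.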


2.11. Immediate consequences of the definition are: (1) the integral is
single-valued; (2) it is a linear function of the integrand; (3) if $\phi(s)$ is finite for
every s and is Lebesgue integrable, and if $x(s) = x_0$, a constant, then $\phi(s)x(s)$
is defined and integrable and $\int_E \phi(s)x(s)\,d\alpha = x_0 \cdot \int_E \phi(s)\,d\alpha$ (in particular,
$\int_E x(s)\,d\alpha = x_0 \cdot \alpha(E)$); (4) if $x(s)$ is integrable and $y(s) = x(s)$ a.e., then $y(s)$ is
integrable and $\int_E y(s)\,d\alpha = \int_E x(s)\,d\alpha$ for every measurable E; (5) if $\mathfrak{X}$ is the
space of reals, this definition coincides with the Lebesgue integral.


THEOREM 2.2. If $x(s)$ from S to $\mathfrak{X}$ is integrable, and if $y = U(x)$ is a linear
operation from $\mathfrak{X}$ to $\mathfrak{Y}$, then $y(s) = U(x(s))$ is integrable and
$U(\int_E x(s)\,d\alpha) = \int_E U(x(s))\,d\alpha$.


* [2], p. 247; [17].
† This definition is analogous to that of I. Gelfand [15]. Gelfand, however, considers functions
defined from a linear interval to $\mathfrak{X}$.
If any g in $\bar{\mathfrak{Y}}$ is given, the identity $g(y) = g(U(x)) = f(x)$ defines an element
f of $\bar{\mathfrak{X}}$. By hypothesis the function $f(x(s)) = g(y(s))$ is Lebesgue integrable
over any measurable E, and

\[
  \int_E g(y(s))\,d\alpha = \int_E f(x(s))\,d\alpha
    = f\Bigl( \int_E x(s)\,d\alpha \Bigr)
    = g\Bigl( U\Bigl( \int_E x(s)\,d\alpha \Bigr) \Bigr) .
\]

This implies the conclusions of the theorem.

2.21. If (J) is an integral definition for functions $x(s)$ from S to $\mathfrak{X}$ that
reduces to the Lebesgue when $\mathfrak{X}$ is the space of reals, and if the (J) integral has
the above property of Theorem 2.2, it is clear, on letting $\mathfrak{Y}$ be the real
numbers, that every $x(s)$ that is (J) integrable is also integrable and to the same
value. Thus the ($\mathfrak{X}$) integral includes both that of Bochner ([9], p. 475) and
that of G. Birkhoff ([6], p. 371). From this second inclusion it follows that
an ($\mathfrak{X}$) integral is not always of bounded variation. However (Theorems 2.4
and 2.5), it is completely additive and, in the sense of Saks, absolutely
continuous as a function of measurable sets.

2.3. If $\sum_n x_n$ is a series of elements of $\mathfrak{X}$, and if every subseries of $\sum_n x_n$
is convergent, then $\sum_n x_n$ is said to be unconditionally convergent ([12], p. 33);
a function $X(E)$ from $\Sigma$ to $\mathfrak{X}$ is completely additive ([6], p. 365) if and only
if for every sequence $\{E_n\}$ of disjoint measurable sets $\sum_n X(E_n)$ is
unconditionally convergent and $\sum_n X(E_n) = X(\sum_n E_n)$. A series $\sum_n x_n$ is said to have
property (O) of Orlicz ([11], p. 244, Theorem 2) if and only if every subseries
is weakly convergent to an element of $\mathfrak{X}$.

The next lemma and theorem were proved by Orlicz in [11] (Theorem 2)
for the case $\mathfrak{X}$ weakly complete; the general theorem is credited by Banach
([1], p. 240) to the same author without proof or reference. Since we know no
published proof to which to refer, and since the lemma is fundamental for
our purposes, we include the following demonstration.*


LEMMA 2.31. If $\sum x_n$ has property (O), then given $\epsilon > 0$ there exists an $N_\epsilon$
such that f in $\bar{\mathfrak{X}}$ and $\|f\| = 1$ implies $\sum_{n \ge N_\epsilon} |f(x_n)| < \epsilon$; hence $\|x_n\| \to 0$.


Let $\mathfrak{Y}$ be the separable closed linear hull of the $x_n$’s. To prove the first
part it is sufficient to show that if $\{g_i\}$ is a weakly convergent sequence of
functionals over $\mathfrak{Y}$ converging weakly to $g_0$, then in the space $l^1$ of absolutely
convergent series the norms of the elements $\lambda_i = \{g_i(x_n) - g_0(x_n)\}$ converge to
0. For if there exist an $\epsilon_0 > 0$, a sequence $\{g_i\}$ in $\bar{\mathfrak{Y}}$, and integers $\{N_i\}$ such
that $\|g_i\| \le 1$ and $N_i \to \infty$, and if

\[
  \sum_{n = N_i}^{\infty} |g_i(x_n)| \ge \epsilon_0 > 0, \quad \text{for all } i, \tag{1}
\]

then since $\mathfrak{Y}$ is separable and $\|g_i\| \le 1$, we may suppose ([1], p. 123, Theorem
3) that $\{g_i\}$ is a weakly convergent sequence of functionals over $\mathfrak{Y}$, and hence
that there is a $g_0$ such that $\{g_i\}$ converges weakly to $g_0$. But $\sum x_n$ has property
(O), and for $i = 0, 1, 2, \dots$, the functional $g_i$ is extensible from $\mathfrak{Y}$ to $\mathfrak{X}$; thus
$\sum_n |g_i(x_n)| < \infty$, and if $\|\lambda_i\| \to 0$, the inequality (1) is contradicted and the
first part of the lemma established.

* This lemma has been generalized by Dunford, Uniformity in linear spaces, loc. cit.

To prove that $\|\lambda_i\| \to 0$ it is necessary and sufficient to show ([1],
pp. 138-139) that $d(\lambda_i) \to 0$ for every linear functional $d(\lambda)$ defined over $l^1$ for which
the associated bounded sequence $\{d_n\}$ is composed only of 0’s, $+1$’s, and
$-1$’s. If $d(\lambda)$ is such a functional, then by property (O)

\[
  \sum_{n} d_n f(x_n)
    = \sum_{d_n = +1} f(x_n) - \sum_{d_n = -1} f(x_n)
    = f(x^+) - f(x^-)
    = f(x_0) \tag{2}
\]

for every f in $\bar{\mathfrak{X}}$. Then the sequence $\{\sum_{i=1}^{n} d_i x_i\}$ converges weakly to $x_0$, so
that $x_0$ must be the strong limit of certain linear combinations of the elements
$\{d_i x_i\}$; hence $x_0$ is in $\mathfrak{Y}$.

From A and the fact that $x_0$ is in $\mathfrak{Y}$ it follows that (2) must hold for any g
in $\bar{\mathfrak{Y}}$. This implies

\[
  d(\lambda_i) = \sum_{n=1}^{\infty} d_n \bigl[ g_i(x_n) - g_0(x_n) \bigr]
    = g_i(x_0) - g_0(x_0)
\]

for each i. But $g_i(x) \to g_0(x)$ for every x in $\mathfrak{Y}$ including $x_0$; hence $d(\lambda_i) \to 0$.
The fact that $\|x_n\| \to 0$ follows immediately on applying B to the result
just established.


THEOREM 2.32. Unconditional convergence is equivalent to property (O).


Unconditional convergence implies property (O), since convergence to an
element implies weak convergence to that element.

If $\sum_n x_n$ is not unconditionally convergent, there is a non-convergent
subseries. By inserting parentheses this subseries can be so “grouped” that the
resulting series $\sum_m y_m$ has $\limsup \|y_m\| > 0$. If $\sum_n x_n$ has property (O), so does
every subseries and every finite grouping of a subseries; then by the lemma
$\|y_m\| \to 0$, and we have a contradiction.
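
A concrete series may help fix the distinction between unconditional and absolute convergence that Lemma 2.31 and Theorem 2.32 rely on. The example below is a standard one and is not taken from the text; the space $\ell^2$ and the unit vectors $e_n$ are assumptions made for the sketch.

```latex
% Illustrative sketch (not from the paper): a series that is unconditionally
% convergent, so has property (O), but is not absolutely convergent.
% Assumed setting: \mathfrak{X} = \ell^2 (real) with orthonormal unit vectors e_n.
\[
  x_n = \frac{e_n}{n} , \qquad
  \sum_{n} \| x_n \| = \sum_{n} \frac{1}{n} = \infty .
\]
% Every subseries converges strongly, because for n_1 < n_2 < \cdots
\[
  \Bigl\| \sum_{k=p}^{q} x_{n_k} \Bigr\|^2 = \sum_{k=p}^{q} \frac{1}{n_k^{2}}
  \;\longrightarrow\; 0 \qquad (p, q \to \infty) .
\]
% Uniform tail estimate of Lemma 2.31: a functional f with \|f\| = 1 has the
% form f(x) = (x, y) with \|y\| = 1, and Cauchy's inequality gives
\[
  \sum_{n \ge N} | f(x_n) | = \sum_{n \ge N} \frac{|y_n|}{n}
  \le \Bigl( \sum_{n \ge N} y_n^{2} \Bigr)^{1/2}
      \Bigl( \sum_{n \ge N} \frac{1}{n^{2}} \Bigr)^{1/2}
  \le \Bigl( \sum_{n \ge N} \frac{1}{n^{2}} \Bigr)^{1/2} ,
\]
% which is less than \epsilon for N large, independently of f.
```

The bound on the right depends only on $N$, which is the uniformity in $f$ that the lemma asserts.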


THEOREM 2.4. If $x(s)$ is integrable, then

\[
  X(E) = \int_E x(s)\,d\alpha
\]

is a completely additive function of measurable sets.


Let $\{E_n'\}$ be disjoint measurable sets. From the integrability of $x(s)$ and
the complete additivity of the Lebesgue integral we have

\[
  f(X(E')) = \int_{E'} f(x(s))\,d\alpha
    = \sum_n \int_{E_n'} f(x(s))\,d\alpha
    = \sum_n f(X(E_n')),
\]

where $E' = \sum_n E_n'$ and f in $\bar{\mathfrak{X}}$ is arbitrary. If any sequence $\{E_n\}$ of disjoint
measurable sets is given, this must hold for every subsequence $\{E_n'\}$; hence
$\sum_n X(E_n)$ has property (O) and is therefore unconditionally convergent by
Theorem 2.32.


THEOREM 2.5.* If $x(s)$ is integrable, then

\[
  X(E) = \int_E x(s)\,d\alpha
\]

is absolutely continuous.


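If not there exist an $\epsilon > 0$ and sequences $\{E_i\}$, $\{f_i\}$ such that $\alpha(E_i) \to 0$,
$\|f_i\| = 1$, and

\[
  \Bigl| f_i\Bigl( \int_{E_i} x(s)\,d\alpha \Bigr) \Bigr|
    = \Bigl| \int_{E_i} f_i(x(s))\,d\alpha \Bigr| > \epsilon .
\]

Let $\phi(s) = \text{l.u.b.}_i\, |f_i(x(s))| \le \|x(s)\|$, and let $M_n = S[n-1 \le \phi(s) < n]$ for each
integer $n > 0$. The sets $\{M_n\}$ are disjoint and measurable, $\sum_n M_n = S$, and s
in $M_n$ implies $|f_i(x(s))| < n$ for all i.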

Since E,=>>,E,M, and X(E) is completely additive, 


f x(s)da 


there must then be an integer m, >) =0 such that 


| 


n=ng+1 E,-Mn 


> «(s)da 


ex | 
n=1 E,-Mn 


€ 
> 
2 


* The method of proof here was suggested by E. J. McShane. 


| 
Suppose there exist sets E;,, - - 


< +--+ <m, such that 


forj=1,2,---, 


Then 


n=nj—1+1 


k. Since a(E;)—0 there is an i,4; such that a(E;,,,) 
> x(s)da 


| =| __ 
-, E;, from {E;} and integers 


n=1 Ein. Mn Mn 
ny a( Ei, ,,) + | > Sin ss(4(s))da | 
| Bing, Mn 
<i+| |; 
hence there clearly exists an 1,4; >m, such that 
f | > 
n=nptl 
Then, if one sets 
n=np+l 
it follows that 


This completes the induction; thus there exist sequences {n,}, {i.} of in- 
tegers such that and if G,,: is defined as above, for k=0,1,2,---, 


tk+1 


But the sets G, are disjoint and measurable, so that >-,_,fe,x(s)da 
=f zo,*(s)da, whence || a contradiction. 


COROLLARY 2.51. If $x(s)$ is integrable, then for any bounded set $M = [f]$ of
elements of $\bar{\mathfrak{X}}$ the integrals $\int_E |f(x(s))|\,d\alpha$ are equi-absolutely continuous.


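If $\|f\| \le K$, we have $|\int_E f(x(s))\,d\alpha| = |f(\int_E x(s)\,d\alpha)| \le \|f\| \cdot \|\int_E x(s)\,d\alpha\|$
$\le K \cdot \|\int_E x(s)\,d\alpha\| < \epsilon$ for $\alpha(E)$ less than a suitably chosen $\delta$. Thus if
$E_f^+ = E[f(x(s)) \ge 0]$ and $E_f^- = E - E_f^+$ we have

\[
  \int_E |f(x(s))|\,d\alpha
    = \Bigl| \int_{E_f^+} f(x(s))\,d\alpha \Bigr| + \Bigl| \int_{E_f^-} f(x(s))\,d\alpha \Bigr|
    < 2\epsilon
\]

for $\alpha(E) < \delta$, since $\alpha(E)$ is non-negative and $E_f^+ + E_f^- = E$.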

3. Operations defined by integrable functions. In this section S will be
the closed interval [0, 1] and $\alpha(E)$ the Lebesgue measure function. According
to custom $L^p$, $(1 \le p < \infty)$, will denote the class of real-valued measurable
functions $\phi(s)$ defined over S and having $|\phi(s)|^p$ (L) integrable, and $L^\infty$ the
space of the real-valued essentially bounded and measurable functions defined
over S. It is well known that $L^p$ forms a B-space when $\|\phi\| = (\int_S |\phi(s)|^p\,ds)^{1/p}$
for $1 \le p < \infty$ and $\|\phi\| = \text{ess. sup.}_s\, |\phi(s)|$ for $p = \infty$. There is no loss of
generality in supposing that every element $\phi(\cdot)$ of $L^p$ has no infinite functional
values.


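DEFINITION 3.1. If $1 \le p' \le \infty$, then $\mathfrak{L}^{p'}(\mathfrak{X})$ is, for a given $\mathfrak{X}$, the class of all
functions $x(s)$ from S to $\mathfrak{X}$ having $f(x(\cdot))$ in $L^{p'}$ for every f in $\bar{\mathfrak{X}}$;* if $p' = \infty$, the
added condition

\[
  \text{l.u.b.}_{\|f\| = 1}\, \bigl( \text{ess. sup.}_s\, |f(x(s))| \bigr) < \infty
\]

must be satisfied by each $x(\cdot)$ in $\mathfrak{L}^{\infty}(\mathfrak{X})$; this l.u.b. is denoted by $((x))_\infty$. The class
$\mathfrak{L}_0^{p'}(\mathfrak{X})$ is defined to be the subclass of $\mathfrak{L}^{p'}(\mathfrak{X})$ composed of integrable functions.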


3.11. The class $\mathfrak{L}^{\infty}(\mathfrak{X})$ always includes the weakly measurable and
essentially bounded functions, and $((x))_\infty \le \text{ess. sup.}_s\, \|x(s)\|$. A partial converse is
that if the separably valued function $x(\cdot)$ is in $\mathfrak{L}^{\infty}(\mathfrak{X})$, then $x(s)$ is not only
measurable (by Theorem 1.1) but also essentially bounded. Suppose $x(s)$ is
in $\mathfrak{Y}$, a separable subspace, for almost all s, and that $x(\cdot)$ is in $\mathfrak{L}^{\infty}(\mathfrak{X})$. Let $\{g_i\}$
be weakly dense in the surface of the unit sphere of $\bar{\mathfrak{Y}}$, and let A be the set of
values of s such that $\|x(s)\| > ((x))_\infty$ and $A_i$ the set such that $|g_i(x(s))| > ((x))_\infty$.
If $A' = S - A$, $A_i' = S - A_i$, then by the proof of Theorem 1.1 we have
$A' = \prod_i A_i'$ and hence $A = \sum_i A_i$. Since $\alpha(A_i) = 0$, this implies that $x(s)$ is
essentially bounded and that $\text{ess. sup.}_s\, \|x(s)\| = ((x))_\infty$.

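Thus if $\mathfrak{X}$ is separable, $\mathfrak{L}^{\infty}(\mathfrak{X})$ is the class $S_\infty(\mathfrak{X})$ ([9], p. 474) composed of
essentially bounded and measurable functions, that is, essentially bounded
Bochner integrable functions. Since $\mathfrak{L}^{\infty}(\mathfrak{X}) \supset \mathfrak{L}_0^{\infty}(\mathfrak{X}) \supset S_\infty(\mathfrak{X})$, we have, for
separable $\mathfrak{X}$,

\[
  S_\infty(\mathfrak{X}) = \mathfrak{L}_0^{\infty}(\mathfrak{X}) = \mathfrak{L}^{\infty}(\mathfrak{X}) .
\]

* This is due to Dunford (cf. [8]) and constitutes a direct generalization of the ($\mathfrak{X}$) integral.
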
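THEOREM 3.2. If $x(\cdot)$ is in $\mathfrak{L}^{p'}(\mathfrak{X})$, $(1 \le p' \le \infty)$, the operation

\[
  U(f) = f(x(\cdot))
\]

from $\bar{\mathfrak{X}}$ to $L^{p'}$ is linear, and

\[
  \|U\| =
  \begin{cases}
    \text{l.u.b.}_{\|f\| = 1} \Bigl( \int_0^1 |f(x(s))|^{p'}\,ds \Bigr)^{1/p'} & \text{if } 1 \le p' < \infty, \\[4pt]
    ((x))_\infty & \text{if } p' = \infty .
  \end{cases}
\]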

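The additivity of U is trivial. In case $p' = \infty$

\[
  \|U(f)\| = \text{ess. sup.}_s\, |f(x(s))| \le ((x))_\infty \cdot \|f\| ,
\]

and U is linear. If $p' < \infty$ and $f_n \to f_0$ in $\bar{\mathfrak{X}}$, then $f_n(x(s)) \to f_0(x(s))$ for every
s, and by Fatou’s lemma

\[
  \Bigl( \int_0^1 |f_0(x(s))|^{p'}\,ds \Bigr)^{1/p'}
    \le \liminf_n \Bigl( \int_0^1 |f_n(x(s))|^{p'}\,ds \Bigr)^{1/p'} ,
\]

that is, $\|U(f_0)\| \le \liminf_n \|U(f_n)\|$, and U is again linear. The remainder of the
theorem follows from the fact that $\|V\| = \text{l.u.b.}_{\|y\| = 1}\, \|V(y)\|$ for an arbitrary
linear operation $V(y)$.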

DEFINITION 3.21. In the linear space $\mathfrak{L}^{p'}(\mathfrak{X})$ the norm of $x(\cdot)$ is
$((x))_{p'} = \|U\|$.

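3.22. In the linear normed subspace $\mathfrak{L}_0^{1}(\mathfrak{X})$ a sequence $\{x_n\}$ is a Cauchy
sequence if and only if $\text{l.u.b.}_E\, \|\int_E (x_m - x_n)\,ds\| \to 0$.* For

\[
  \text{l.u.b.}_E \Bigl\| \int_E (x_m - x_n)\,ds \Bigr\|
    = \text{l.u.b.}_E\ \text{l.u.b.}_{\|f\| = 1} \Bigl| \int_E f(x_m(s) - x_n(s))\,ds \Bigr|
    \le \text{l.u.b.}_{\|f\| = 1} \int_0^1 |f(x_m - x_n)|\,ds
    = ((x_m - x_n))_1 ,
\]

and, on the other hand, if $\text{l.u.b.}_E\, \|\int_E (x_m - x_n)\,ds\| < \epsilon$, then for $\|f\| = 1$

\[
  \int_0^1 |f(x_m - x_n)|\,ds
    = \Bigl| \int_{E^+} f(x_m - x_n)\,ds \Bigr| + \Bigl| \int_{E^-} f(x_m - x_n)\,ds \Bigr|
    \le \Bigl\| \int_{E^+} (x_m - x_n)\,ds \Bigr\| + \Bigl\| \int_{E^-} (x_m - x_n)\,ds \Bigr\|
    < 2\epsilon .
\]

* See [6], p. 368, where this l.u.b. is used to norm the Birkhoff integrable functions.

Thus in $\mathfrak{L}_0^{1}(\mathfrak{X})$ these two definitions of norm are topologically equivalent;
they are not, however, identical.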

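In passing we note the obvious facts that if $x(\cdot)$ is in $\mathfrak{L}^{p'}(\mathfrak{X})$, then $x(\cdot)$ is
in $\mathfrak{L}^{q'}(\mathfrak{X})$ for $1 \le q' < p'$, and that

$$\lim_{m,n} ((x_m - x_n))_{p'} = 0 \;\to\; \lim_{m,n} ((x_m - x_n))_{q'} = 0 \;\to\; \lim_{m,n} ((x_m - x_n))_1 = 0$$

$$\to\; \lim_{m,n}\ \text{l.u.b.}_E \Bigl\|\int_E (x_m - x_n)\,ds\Bigr\| = 0 \;\to\; \lim_{m,n} \Bigl\|\int_E (x_m - x_n)\,ds\Bigr\| = 0$$

for every measurable $E$, the arrows meaning "implies that."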

In the remainder of this section we shall consider pairs $p$, $p'$ of "conjugate
exponents," that is, numbers $p$, $p'$ connected by the relations $1/p + 1/p' = 1$
for $1 < p < \infty$ (or $1 < p' < \infty$), $p' = \infty$ for $p = 1$, and $p' = 1$ for $p = \infty$, and shall
consider the operations from $L^{p}$ to $\mathfrak{X}$ defined by elements of $\mathfrak{L}_0^{p'}(\mathfrak{X})$. Given $L^{p}$
we shall deal only with $\mathfrak{L}^{p'}(\mathfrak{X})$, where $p$ and $p'$ are conjugate, and given $\mathfrak{L}^{q}(\mathfrak{X})$,
only with $L^{q'}$, $q$ and $q'$ conjugate.


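THEOREM 3.3. If $x(\cdot)$ is in $\mathfrak{L}^{p'}(\mathfrak{X})$, then for a fixed $\varphi(\cdot)$ in $L^{p}$ the integral

$$(1)\qquad F_\varphi(f) = \int_S f(x(s))\varphi(s)\,ds$$

defines a linear functional $F_\varphi$ over $\bar{\mathfrak{X}}$,‡ and the operation

$$(2)\qquad F_\varphi = U^{*}(\varphi)$$

so defined from $L^{p}$ to $\bar{\bar{\mathfrak{X}}}$ is linear, with $\|U^{*}\| = ((x))_{p'}$.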


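For a fixed $x(\cdot)$ in $\mathfrak{L}^{p'}(\mathfrak{X})$, $(1 \le p' \le \infty)$, the operation $U(f) = f(x(\cdot))$ from
$\bar{\mathfrak{X}}$ to $L^{p'}$ is linear by Theorem 3.2. Then if $g_\varphi$ is the linear functional over $L^{p'}$
generated by the element $\varphi(\cdot)$ of $L^{p}$, it follows that

$$(3)\qquad g_\varphi(U(f)) = \int_S f(x(s))\varphi(s)\,ds = F_\varphi(f)$$

defines a linear functional $F_\varphi(f)$ over $\bar{\mathfrak{X}}$, that is, an element $F_\varphi$ of $\bar{\bar{\mathfrak{X}}}$. If $p' < \infty$,
so that $g$ is a linear functional over $L^{p'}$ only if it is so generated, the operation

$$F_\varphi = U^{*}(\varphi)$$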


t If we take ?=2 in example 4 of [6], then ((x));=2-"? while Lub. p|| 
t See [8]. 


thus defined from $\overline{L^{p'}} = L^{p}$ to $\bar{\bar{\mathfrak{X}}}$ is by definition the operation $\bar{U}$ adjoint to $U$;
hence $U^{*}$ is linear and $\|U^{*}\| = \|\bar{U}\| = \|U\| = ((x))_{p'}$.

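If $p' = \infty$, then $\bar{U}$ is from $\overline{L^{\infty}}$ to $\bar{\bar{\mathfrak{X}}}$, where $\overline{L^{\infty}}$ is not an $L^{p}$ class ([13], p. 875;
[16]). But $\overline{L^{\infty}}$ includes $L^{1}$ in the sense that any $\varphi(\cdot)$ in $L^{1}$ defines a linear
functional

$$g_\varphi(\psi) = \int_S \psi(s)\varphi(s)\,ds, \qquad \psi \text{ in } L^{\infty},$$

over $L^{\infty}$, where $\|g_\varphi\|$ is $\int_S |\varphi(s)|\,ds$, the norm of $\varphi(\cdot)$ as an element of $L^{1}$.
Thus $U^{*}(\varphi)$ maps this imbedded subspace into $\bar{\bar{\mathfrak{X}}}$, and by (2) and equation
(3) it is evident that $U^{*}$ coincides with $\bar{U}$ where $U^{*}$ is defined. Hence $U^{*}$ is
linear and $\|U^{*}\| \le ((x))_\infty$. If $\|U^{*}\| < ((x))_\infty$, then there is an $f$ with $\|f\| = 1$ and
$\|U^{*}\| <$ ess. sup.$_s\, |f(x(s))|$, that is, $\|U^{*}\| < |f(x(s))|$ on a set of measure
greater than zero. Hence there is an $f_0$ (either $+f$ or $-f$) and a set $S_0$ such that
$\|f_0\| = 1$, $\alpha(S_0) > 0$, and $f_0(x(s)) > \|U^{*}\| \ge 0$ on $S_0$. Letting $\varphi_0 = 1/\alpha(S_0)$ times
the characteristic function of $S_0$, we have

$$\|U^{*}\| = \|U^{*}\| \cdot \|\varphi_0\| \ge \|U^{*}(\varphi_0)\| \ge |F_{\varphi_0}(f_0)| = \Bigl|\int_{S_0} f_0(x(s)) \frac{1}{\alpha(S_0)}\,ds\Bigr| > \|U^{*}\|,$$

which is impossible. Thus $\|U^{*}\| = \|\bar{U}\| = \|U\|$.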
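
THEOREM 3.4. In order that $x(\cdot)$ be in $\mathfrak{L}_0^{p'}(\mathfrak{X})$, it is necessary and sufficient
that $x(\cdot)\varphi(\cdot)$ be integrable for every $\varphi$ in $L^{p}$. If $x(\cdot)$ is in $\mathfrak{L}_0^{p'}(\mathfrak{X})$, then the
operation

$$(4)\qquad V(\varphi) = \int_S x(s)\varphi(s)\,ds$$

from $L^{p}$ to $\mathfrak{X}$ is linear, with $\|V\| = ((x))_{p'}$. Moreover, the operation $\bar{V}(f)$ adjoint
to $V$ is from $\bar{\mathfrak{X}}$ to $\overline{L^{p}}$ and assigns to each $f$ in $\bar{\mathfrak{X}}$ the linear functional over $L^{p}$
generated by the element $U(f) = f(x(\cdot))$ of $L^{p'}$.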

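Suppose that $x(\cdot)$ is in $\mathfrak{L}_0^{p'}(\mathfrak{X})$. If $\varphi(s)$ is a step-function, the integrability
of $x(s)\varphi(s)$ follows from the linearity and set-additivity of the integral. If $\varphi(\cdot)$
in $L^{p}$ is arbitrary, let $\{\varphi_n\}$ be step-functions converging to $\varphi$ in $L^{p}$. For any
measurable $E$ and any $m$, $n$ there exists, by B, an $f_{mn}$ in $\bar{\mathfrak{X}}$ such that $\|f_{mn}\| = 1$
and

$$\Bigl\|\int_E x\varphi_m\,ds - \int_E x\varphi_n\,ds\Bigr\| = \Bigl| f_{mn}\Bigl(\int_E x(\varphi_m - \varphi_n)\,ds\Bigr)\Bigr|;$$

on applying Hölder's inequality, we obtain

$$\Bigl\|\int_E x\varphi_m\,ds - \int_E x\varphi_n\,ds\Bigr\| = \Bigl|\int_E f_{mn}((\varphi_m - \varphi_n)x)\,ds\Bigr| \le ((x))_{p'} \cdot \|\varphi_m - \varphi_n\| \to 0.$$

If $x_E = \lim_n \int_E x(s)\varphi_n(s)\,ds$, then for any $f$ in $\bar{\mathfrak{X}}$ it follows that

$$f(x_E) = \lim_n f\Bigl(\int_E x\varphi_n\,ds\Bigr) = \lim_n \int_E f(x)\varphi_n\,ds = \int_E f(x)\varphi\,ds,$$

and $x(s)\cdot\varphi(s)$ is integrable.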

If $x(s)\varphi(s)$ is integrable for every $\varphi$ in $L^{p}$, then $f(x(s))\varphi(s)$ is summable for
every $\varphi$; hence $f(x(\cdot))$ is in $L^{p'}$, that is, $x(\cdot)$ is in $\mathfrak{L}^{p'}(\mathfrak{X})$. The integrability of
$x(s)$ results from letting $\varphi$ be the characteristic function of measurable $E$.

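If $x_0$ in $\mathfrak{X}$ is fixed, the identity

$$(5)\qquad F_{x_0}(f) = f(x_0), \qquad \text{for all } f \text{ in } \bar{\mathfrak{X}},$$

defines a linear functional $F_{x_0}$ over $\bar{\mathfrak{X}}$. If one sets

$$F_{x_0} = W(x_0),$$

the operation $W$ is linear and norm-preserving from $\mathfrak{X}$ to $\bar{\bar{\mathfrak{X}}}$ and is 1-1 from $\mathfrak{X}$
to its contradomain, the subspace $\mathfrak{X}'$ of $\bar{\bar{\mathfrak{X}}}$ composed of all elements having a
representation of the form (5) ([1], p. 189, proof of Theorem 13). Since $W$
is 1-1 and norm-preserving, and since $\mathfrak{X}$ is a B-space, $\mathfrak{X}'$ is also a B-space and

$$x = W^{-1}(F)$$

from $\mathfrak{X}'$ to $\mathfrak{X}$ is linear and norm-preserving.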
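For $\varphi$ in $L^{p}$ the integrability of $x(s)\varphi(s)$ implies

$$F_\varphi(f) = \int_S f(x(s))\varphi(s)\,ds = f\Bigl(\int_S x(s)\varphi(s)\,ds\Bigr),$$

and $F_\varphi$ is in $\mathfrak{X}'$, with

$$W^{-1}(F_\varphi) = \int_S x(s)\varphi(s)\,ds$$

or

$$W^{-1}(U^{*}(\varphi)) = V(\varphi).$$

The operation $V(\varphi)$ must then be linear, and, since $W^{-1}$ is norm-preserving,

$$\|V\| = \|U^{*}\| = ((x))_{p'}.$$

To complete the proof suppose $f$ in $\bar{\mathfrak{X}}$, and let $f'$ be the linear functional
over $L^{p}$ defined by $f'(\varphi) = f(V(\varphi))$; that is, let $f' = \bar{V}(f)$ where $\bar{V}$ is adjoint
to $V$. Then

$$f'(\varphi) = f\Bigl(\int_S x(s)\varphi(s)\,ds\Bigr) = \int_S f(x(s))\varphi(s)\,ds$$

holds for every $\varphi$ in $L^{p}$, where $f(x(\cdot))$ is in $L^{p'}$; thus $f'$ is the linear functional
over $L^{p}$ generated by the element $f(x(\cdot)) = U(f)$ of $L^{p'}$.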

COROLLARY 3.41. If $x(s)$ is integrable and $\varphi(s)$ is a measurable and essentially
bounded real-valued function, then $x(s)\varphi(s)$ is integrable.

4. Convergence and approximation of integrable functions. We begin with 
the following theorem: 

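THEOREM 4.1. If $\{x_n(s)\}$ is a sequence of integrable functions converging
weakly in measure to $x(s)$, and if $\lim_n \int_E x_n(s)\,d\alpha$ exists for every measurable $E$,
then $x(s)$ is integrable and

$$\int_E x(s)\,d\alpha = \lim_n \int_E x_n(s)\,d\alpha.$$

Let $x_E$ denote the limit $\lim_n \int_E x_n(s)\,d\alpha$. Since $f(x_n(s)) \to f(x(s))$ in measure and
since also $f(x_E) = \lim_n f(\int_E x_n(s)\,d\alpha) = \lim_n \int_E f(x_n(s))\,d\alpha$, from real-function
theory it follows that $f(x(s))$ is integrable and that

$$\int_E f(x(s))\,d\alpha = \lim_n \int_E f(x_n(s))\,d\alpha = f(x_E)$$

for any $f$ in $\bar{\mathfrak{X}}$ and any measurable $E$. Thus $x(s)$ is integrable and
$\int_E x(s)\,d\alpha = \lim_n \int_E x_n(s)\,d\alpha$.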

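THEOREM 4.2. If $\{x_n\}$ is a Cauchy sequence in $\mathfrak{L}_0^{p}(\mathfrak{X})$ and $x_n(s) \to x(s)$
weakly in measure, then $x(\cdot)$ is in $\mathfrak{L}_0^{p}(\mathfrak{X})$ and $((x - x_n))_p \to 0$.

Since $\{x_n\}$ is a Cauchy sequence,

$$\lim_{m,n}\ \text{l.u.b.}_{\|f\|=1} \|f(x_m(\cdot)) - f(x_n(\cdot))\|_{L^{p}} = 0;$$

it follows for any $f$, on considering $f_1 = f/\|f\|$, that the functions $f(x_n(\cdot))$
converge as elements in $L^{p}$. Since $f(x_n(s)) \to f(x(s))$ in measure, the conclusion is
that $f(x(\cdot))$ is in $L^{p}$, that is, $x(\cdot)$ is in $\mathfrak{L}^{p}(\mathfrak{X})$. On the other hand, the hypotheses
here imply those of Theorem 4.1, so that $x(\cdot)$ is in $\mathfrak{L}_0^{p}(\mathfrak{X})$.

Now by hypothesis, given an $\varepsilon > 0$ there exists an $N_\varepsilon$ such that if $m, n \ge N_\varepsilon$,
then

$$\text{l.u.b.}_{\|f\|=1} \|f(x_m(\cdot)) - f(x_n(\cdot))\|_{L^{p}} < \varepsilon;$$

thus for any fixed $m \ge N_\varepsilon$ and for any $f$ with $\|f\| = 1$ we obtain

$$\|f(x_m(\cdot)) - f(x(\cdot))\|_{L^{p}} \le \varepsilon$$

on letting $n \to \infty$ and recalling that $f(x_n(\cdot)) \to f(x(\cdot))$ in $L^{p}$ and that in $L^{p}$ the
norm is a continuous function. This last inequality holds for any $\|f\| = 1$ and
any $m \ge N_\varepsilon$; then $((x_m - x))_p \le \varepsilon$ for $m \ge N_\varepsilon$, or $((x_n - x))_p \to 0$.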

THEOREM 4.3. The function $x(s)$ is integrable and measurable if and only if
there exist step-functions $\{y_n(s)\}$ such that $y_n(s) \to x(s)$ a.e. and $((y_n - x))_1 \to 0$.

That the existence of such a sequence implies $x(s)$ measurable and
integrable is evident from Theorem 4.2. In the proof of the converse two
lemmas will be used.

LEMMA 4.31. The theorem is true for integrable countably-valued functions.

Let $x(s)$ have the constant value $x_i$ on $E_i$, the sets $\{E_i\}$ being countable,
measurable, and disjoint, and let $S = \sum E_i$. Since $\sum_i \int_{E_i} x(s)\,d\alpha$ converges
unconditionally, there exists (Theorem 2.32 and Lemma 2.31) for each $1/2^{j}$ an $N_j$
such that $\|f\| \le 1$ implies

$$\sum_{i=N_j+1}^{\infty} \Bigl|\int_{E_i} f(x(s))\,d\alpha\Bigr| = \sum_{i=N_j+1}^{\infty} \int_{E_i} |f(x(s))|\,d\alpha < 1/2^{j},$$

with $N_1 < N_2 < \cdots$. By defining $y_n(s) = x_i$ on $E_i$ for $i \le N_n$ and $y_n(s) = \theta$
elsewhere, one obtains $y_n(s) \to x(s)$ everywhere and

$$\int_S |f(x(s) - y_n(s))|\,d\alpha = \sum_{i=N_n+1}^{\infty} \int_{E_i} |f(x(s))|\,d\alpha < 1/2^{n}$$

for $\|f\| \le 1$; hence $((x - y_n))_1 \to 0$.


LEMMA 4.32. If $x(s)$ is measurable and integrable, it can be uniformly
approximated, except possibly on a set of measure zero, by integrable
countably-valued functions.

In virtue of Corollary 1.12 there are countably-valued functions $x_n(s)$ such
that $\|x(s) - x_n(s)\| < 1/n$ uniformly for $s$ in $S_0$, where $\alpha(S_0) = \alpha(S)$. The
measurable function $x - x_n$ is then essentially bounded, therefore Bochner
integrable, and hence ($\mathfrak{X}$) integrable, which implies integrability for $x_n = x - (x - x_n)$.

Returning to the proof of the theorem, we suppose $x(s)$ integrable and
measurable and $x_n(s)$ a countably-valued integrable function within $1/n$ of
$x(s)$ essentially uniformly (Lemma 4.32). Then for $\|f\| \le 1$ we have

$$\int_S |f(x - x_n)|\,d\alpha \le \int_S \|x - x_n\|\,d\alpha \le \alpha(S)\cdot 1/n,$$

so that $((x - x_n))_1 \le \alpha(S)\cdot 1/n$.

If $n$ is fixed, the integrable c.v. function $x_n(s)$ is a.e. the limit of a
sequence of step-functions, which therefore converge almost uniformly ([5],
p. 442) to $x_n(s)$. Hence there exists a step-function $y_n(s)$ such that
$\|x_n(s) - y_n(s)\| < 1/n$ except on a set of measure at most $1/2^{n}$, while (from the
second conclusion of the first lemma)

$$((x_n - y_n))_1 < \alpha(S)\cdot 1/n.$$

Collecting inequalities, we have $\|x(s) - y_n(s)\| < 2/n$ except on a set of
measure at most $1/2^{n}$, and $((x - y_n))_1 < \alpha(S)\cdot 2/n$. It follows that $y_n(s) \to x(s)$ almost
uniformly, and therefore a.e., and that $((x - y_n))_1 \to 0$.


COROLLARY 4.33. If $\mathfrak{X}$ is separable, then $\mathfrak{L}_0^{1}(\mathfrak{X})$ is separable.


5. The ($\mathfrak{X}$) integral as related to others. In addition to the Bochner (Bn),
Birkhoff (Bk), and Gelfand extensions of the Lebesgue integral, and Dunford's
first integral ([5]; this is equivalent to Bochner's, cf. [9], p. 475), two
other definitions, also due to Dunford, have been given ([7], [8]). The last
two integrals will be here denoted as the (D1) integral and the (D2) integral,
respectively. The first, a direct generalization of the Bochner integral, is
defined as follows: $x(s)$ is (D1) integrable if and only if there exist step-functions
$x_n(s)$ such that $x_n(s) \to x(s)$ a.e. and $\lim_n \int_E x_n(s)\,d\alpha$ exists for every measurable
$E$. This limit is (D1) $\int_E x(s)\,d\alpha$. The second, as we have noted earlier, is a
generalization of the ($\mathfrak{X}$) integral; it requires merely that $x(\cdot)$ be in $\mathfrak{L}^{1}(\mathfrak{X})$ and
defines (D2) $\int_E x(s)\,d\alpha$ to be the element $F_E$ of $\bar{\bar{\mathfrak{X}}}$ given by $F_E(f) = \int_E f(x(s))\,d\alpha$.

In paragraph 2.21 we noted that the ($\mathfrak{X}$) integral includes both the (Bn)
and the (Bk). The connection between the (D1) and (D2) integrals and the
($\mathfrak{X}$) integral will now be considered.


THEOREM 5.1. A function x(s) is measurable and integrable if and only if 
x(s) is (D1) integrable. The integral of x(s) has the same value for the two defini- 
tions. 


This theorem is seen to follow immediately from Theorem 4.3 and 
Theorem 4.1. 


COROLLARY 5.11. For measurable functions the ($\mathfrak{X}$), (Bk), and (D1)
integrals are equivalent.


In view of the theorem and the fact that (Bk) integrability implies ($\mathfrak{X}$)
integrability to the same value, it is sufficient to show that a measurable
integrable $x(s)$ is (Bk) integrable; that is ([6], Theorem 13), given $\varepsilon > 0$ there
exist disjoint measurable sets $\{E_i\}$, $i = 0, 1, \cdots, n, \cdots$, with $\sum E_i = S$, such that
the series $\sum_{i=0}^{\infty} \alpha(E_i)x(s_i)$ is unconditionally convergent for any choice of $s_i$ in
$E_i$, and $s_i$, $s_i'$ in $E_i$ implies $\|\sum_i \alpha(E_i)(x(s_i) - x(s_i'))\| < \varepsilon$. From
Lemma 4.32 there exist a c.v. integrable $y(s)$ and a set $E_0$ such that $\alpha(E_0) = 0$,


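$$(1)\qquad \|x(s) - y(s)\| < \varepsilon/4\alpha(S), \qquad \text{for } s \text{ in } S_0 = S - E_0,$$

and

$$(2)\qquad \Bigl\|\int_S x\,d\alpha - \int_S y\,d\alpha\Bigr\| < \varepsilon/4.$$

Let $\{E_i\}$, $(i = 1, 2, \cdots)$, be disjoint measurable sets with $\sum E_i = S_0$ and
$y(s) = y_i$ on $E_i$. Then $\int_{S_0} y(s)\,d\alpha = \sum \alpha(E_i)y_i$ is unconditionally convergent;
hence, since (1) implies that for arbitrary $s_i$ in $E_i$, $(i = 0, 1, \cdots)$, the
inequality $\sum \|\alpha(E_i)(x(s_i) - y_i)\| < \varepsilon/4$ holds, the series

$$\sum \alpha(E_i)x(s_i) = \sum \alpha(E_i)(x(s_i) - y_i) + \sum \alpha(E_i)y_i$$

is unconditionally convergent, being the sum of two such series. Moreover,
$\|\sum \alpha(E_i)x(s_i) - \int_S y\,d\alpha\| < \varepsilon/4$, and this fact combined with (2) gives
$\|\sum \alpha(E_i)x(s_i) - \int_S x\,d\alpha\| < \varepsilon/2$, which completes the proof.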

A consequence of Corollary 5.11 is the failure of the Fubini theorem to 
hold for the (D1), (¥), and (D2) integrals, since ([6], example 6) the theorem 
fails for the (Bk) integral when X is the separable space L’. 


THEOREM 5.2. The integrals of two measurable and integrable functions x(s), 
y(s) coincide over every measurable set if and only if x(s) =y(s) a.e. 


If the integrals of measurable $x(s)$ and $y(s)$ coincide everywhere, then
$z(s) = x(s) - y(s)$ is separably-valued and has its integral vanishing identically.
Let $S_0$ be a subset in $S$ such that $\alpha(S - S_0) = 0$ and $z(S_0) \subset \mathfrak{Y}$, where $\mathfrak{Y}$ is a
separable subspace of $\mathfrak{X}$. Letting $\{g_i\}$ be a sequence weakly dense in the
surface of the unit sphere of $\bar{\mathfrak{Y}}$, we find as in paragraph 3.11 that $A = \sum A_i$, where
$A = S_0[\|z(s)\| > 0]$ and $A_i = S_0[|g_i(z(s))| > 0]$. Since $\int_E z(s)\,d\alpha = \theta$ for all $E$,
it results that $\int_E g_i(z(s))\,d\alpha = 0$ for every $E$ and each $i$, and therefore that $|g_i(z(s))| = 0$
a.e. in $S_0$. Thus $\alpha(A_i) = 0$, which implies $\alpha(A) = 0$, so that $z(s) = \theta$ a.e. in $S_0$
and hence a.e. in $S$.*

The remaining part of the theorem follows from (4) of paragraph 2.11. 


THEOREM 5.3. If measurable $x(s)$ is in $\mathfrak{L}^{1}(\mathfrak{X})$, then $x(s)$ is integrable if and
only if $F_E = $ (D2) $\int_E x(s)\,d\alpha$ is absolutely continuous.


* Theorem 5.2 can be regarded, in the light of 5.11, as a corollary of Theorem 24 of [6]. In the 
proofs of Corollary 5.11 and Theorem 5.2 we have essentially repeated Birkhoff’s arguments for 
Theorems 22 and 24 of [6]. 
If $x(s)$ is in $\mathfrak{L}_0^{1}(\mathfrak{X})$, then

$$F_E(f) = \int_E f(x(s))\,d\alpha = f\Bigl(\int_E x(s)\,d\alpha\Bigr)$$

for all $f$; therefore $\|F_E\| = \|\int_E x(s)\,d\alpha\|$ where the first norm is taken over $\bar{\bar{\mathfrak{X}}}$ and
the second over $\mathfrak{X}$. The absolute continuity of $F_E$ now follows from
Theorem 2.5.

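Suppose $x(s)$ measurable and in $\mathfrak{L}^{1}(\mathfrak{X})$. Let $\mathfrak{Y}$ be a separable subspace
containing almost all of the values $x(s)$; then $x(\cdot)$ is in $\mathfrak{L}^{1}(\mathfrak{Y})$, since any linear
functional over $\mathfrak{Y}$ is extensible to $\mathfrak{X}$. To show that $x(\cdot)$ is in $\mathfrak{L}_0^{1}(\mathfrak{Y})$, and
therefore that $x(\cdot)$ is in $\mathfrak{L}_0^{1}(\mathfrak{X})$, it is sufficient, $\mathfrak{Y}$ being separable, to prove that for
any measurable $E$ the linear functional $F_E(g)$ over $\bar{\mathfrak{Y}}$ is weakly continuous
([1], p. 131). If $g_n(y) \to g_0(y)$ for every $y$ in $\mathfrak{Y}$, then $\|g_n\| \le K$, $(n = 0, 1, 2, \cdots)$;
and if $F_E$ is a.c., then

$$\Bigl|\int_E g_n(x(s))\,d\alpha\Bigr| = |F_E(g_n)| \le \|g_n\| \cdot \|F_E\| \le K\varepsilon$$

for $\alpha(E) < \delta$. The integrals $\int g_n(x(s))\,d\alpha$ are thus equi-absolutely continuous;
since $g_n(x(s)) \to g_0(x(s))$ for almost all $s$, it follows that

$$F_E(g_n) = \int_E g_n(x(s))\,d\alpha \to \int_E g_0(x(s))\,d\alpha = F_E(g_0)$$

for every $E$. Thus for any fixed $E$ the functional $F_E(g)$ over $\bar{\mathfrak{Y}}$ is weakly
continuous, so that a $y_E$ in $\mathfrak{Y}$ exists such that $g(y_E) = F_E(g)$ for all $g$ in $\bar{\mathfrak{Y}}$. Since
any linear functional over $\mathfrak{X}$ defines one over $\mathfrak{Y}$, and since $y_E$ is in $\mathfrak{Y}$, it follows
that $F_E(f) = f(y_E)$ for all $f$ in $\bar{\mathfrak{X}}$, and $x(s)$ is integrable.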


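COROLLARY 5.31. If measurable $x(s)$ is in $\mathfrak{L}^{p}(\mathfrak{X})$, $(p > 1)$, then $x(\cdot)$ is in
$\mathfrak{L}_0^{p}(\mathfrak{X})$. If $\mathfrak{X}$ is weakly complete, this is also true for $p = 1$.

Suppose $p > 1$. If

$$F_E(f) = \int_S f(x(s))\varphi_E(s)\,d\alpha = \int_E f(x(s))\,d\alpha,$$

where $\varphi_E(s)$ is the characteristic function of $E$, then

$$|F_E(f)| \le ((x))_p \cdot (\alpha(E))^{1/p'} < \varepsilon$$

if $\|f\| \le 1$ and $\alpha(E) < \varepsilon^{p'}/[((x))_p]^{p'}$. Thus $\|F_E\| < \varepsilon$ when $\alpha(E)$ is sufficiently
small; hence $x(\cdot)$ is integrable by the theorem.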
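
The bound used for $p > 1$ is an application of Hölder's inequality to the pair $f(x(\cdot))$, $\varphi_E$. A short LaTeX sketch of that step follows; it assumes $1 < p < \infty$ with $1/p + 1/p' = 1$, $\|f\| \le 1$, and $((x))_p > 0$, and it is offered as an illustration of the estimate rather than as the original derivation.

```latex
% Sketch of the Hoelder step, assuming 1 < p < \infty, 1/p + 1/p' = 1,
% \|f\| \le 1 and ((x))_p > 0; \varphi_E is the characteristic function of E.
\[
  |F_E(f)|
  = \Bigl|\int_S f(x(s))\,\varphi_E(s)\,d\alpha\Bigr|
  \le \Bigl(\int_S |f(x(s))|^{p}\,d\alpha\Bigr)^{1/p}
      \Bigl(\int_S \varphi_E(s)^{p'}\,d\alpha\Bigr)^{1/p'}
  \le ((x))_p\,\alpha(E)^{1/p'} .
\]
% Hence the choice of \alpha(E) in the text gives absolute continuity:
\[
  \alpha(E) < \frac{\varepsilon^{p'}}{[((x))_p]^{p'}}
  \;\Longrightarrow\;
  \|F_E\| = \sup_{\|f\|\le 1} |F_E(f)|
  \le ((x))_p\,\alpha(E)^{1/p'} < \varepsilon .
\]
```

For $p = 1$ the exponent $1/p'$ is zero and the estimate gives no smallness in $\alpha(E)$, which is why the second half of the corollary needs the separate argument by weak completeness.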

Suppose $\mathfrak{X}$ weakly complete and $p = 1$. From Corollary 1.12 and Theorem
4.2 it clearly suffices to consider c.v. functions. For such an $x(s)$

$$\infty > \int_E |f(x(s))|\,d\alpha = \sum_i |f(x_i)|\,\alpha(E_i \cdot E),$$

so that $\sum x_i\alpha(E_i \cdot E)$ is a weakly convergent series in the weakly complete
space $\mathfrak{X}$. Then there exists an $x_E$ such that

$$f(x_E) = \sum_i f(x_i)\alpha(E_i \cdot E) = \int_E f(x(s))\,d\alpha,$$

and $x(s)$ is integrable.

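COROLLARY 5.32. If $1 \le p \le \infty$ and $1 \le q < \infty$, then $\mathfrak{L}^{p}(L^{q}) = \mathfrak{L}_0^{p}(L^{q})$
and $\mathfrak{L}^{p}(l^{q}) = \mathfrak{L}_0^{p}(l^{q})$.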

6. Concerning completely continuous operations. We prove the following 
lemma: 


LEMMA 6.11. If $x(s)$ is a step-function from $[0, 1]$ to $\mathfrak{X}$, the operations

$$U(f) = f(x(\cdot)), \qquad V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$$

from $\bar{\mathfrak{X}}$ to $L^{p'}$ and from $L^{p}$ to $\mathfrak{X}$ are c.c. (completely continuous).

If $U$ is c.c., so are $\bar{U}$ and (Theorem 3.3) $U^{*}$; since $V(\varphi) = W^{-1}(U^{*}(\varphi))$,
where $W^{-1}$ is linear, this implies the same property for $V$. Suppose $x(s) = x_i$
on $E_i$, $(\sum_{1}^{n} E_i = [0, 1])$; then $U(f)$ is the function having the constant value
$f(x_i)$ on $E_i$. If $\|f_m\| \le K$, so that $|f_m(x_i)| \le K\cdot\|x_i\|$, then clearly there exists
a subsequence $\{f_m'\}$ such that the functions $\{f_m'(x(s))\}$ converge uniformly
over $[0, 1]$ and hence $\{f_m'(x(\cdot))\}$ is a convergent sequence in $L^{p'}$ for any $p'$.


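LEMMA 6.12. If $T(\varphi)$ is linear from $L^{p}$, $(p < \infty)$, to a subspace $M$ of finite
dimension in $\mathfrak{X}$, then there exists an $x(\cdot)$ in $\mathfrak{L}_0^{p'}(\mathfrak{X})$ defining $T(\varphi)$, that is,

$$T(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$$

for all $\varphi$ in $L^{p}$; and $x(\cdot)$ is the limit in $\mathfrak{L}_0^{p'}(\mathfrak{X})$ of step-function elements.

Let $x_i$, $f_i$, $(i = 1, 2, \cdots, n)$, be a complete biorthogonal system in $M$, so
that

$$T(\varphi) = \sum_{1}^{n} x_i g_i(\varphi)$$

where $g_i(\varphi) = f_i(T(\varphi))$. The functional $f_i$ being extensible from $M$ to $\mathfrak{X}$, the
last equation implies that $g_i(\varphi)$ is a linear functional over $L^{p}$; since $p < \infty$,
there is an element $\varphi_i$ in $L^{p'}$ generating $g_i(\varphi)$. Set

$$(1)\qquad x(s) = \sum_{1}^{n} x_i\varphi_i(s).$$

Obviously $x(\cdot)$ is in $\mathfrak{L}_0^{p'}(\mathfrak{X})$; and

$$\int_0^1 x(s)\varphi(s)\,ds = \sum_{1}^{n} x_i \int_0^1 \varphi_i(s)\varphi(s)\,ds = \sum_{1}^{n} x_i g_i(\varphi) = T(\varphi).$$

The remainder of the lemma results from approximating, in $L^{p'}$, each $\varphi_i$
in (1) by a real-valued step-function.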


THEOREM 6.1. A necessary and sufficient condition that a linear operation
$V(\varphi)$ from $L^{p}$, $(1 < p < \infty)$, to $\mathfrak{X}$ be c.c. is that it be the strong limit of a sequence
of operations defined by step-functions.

If $V$ from $L^{p}$ to $\mathfrak{X}$ is c.c. and $1 < p < \infty$, then ([14], p. 197) $V$ is the
limit of operations $\{T_n\}$ where each $T_n$ maps $L^{p}$ into a finite-dimensional
subspace of $\mathfrak{X}$. By Lemma 6.12 there exists a step-function $x_n(\cdot)$ whose
corresponding operation $V_n$ is in norm within $1/n$ of $T_n$, so that $\|V - V_n\| \to 0$.

The converse results from Lemma 6.11 and the fact that the limit under
the norm of a sequence of c.c. operations is again c.c.


THEOREM 6.2. If $x(s)$ is measurable and integrable, then $U(f) = f(x(\cdot))$ from
$\bar{\mathfrak{X}}$ to $L^{1}$ and $V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$ from $L^{\infty}$ to $\mathfrak{X}$ are c.c.

This follows from Theorem 4.3 and Lemma 6.11, and the fact that for
$x(\cdot)$ in $\mathfrak{L}_0^{1}(\mathfrak{X})$

$$\|U\| = \|V\| = ((x))_1.$$

7. Representation of integrals by real-valued kernels. We make the
following definition:


DEFINITION 7.1. $L^{p',q}$, $(1 \le p' \le \infty,\ 1 \le q \le \infty)$, is the class of real-valued
functions $F(s, t)$ defined on the unit square and having the following properties:

(a) $F(s, t)$ is measurable in $(s, t)$.

(b) For each $s$ the function $F(s, \cdot)$ is in $L^{q}$.

(c) The iterated integral

$$(1)\qquad \int_0^1 \Bigl\{\int_0^1 F(s, t)\psi(t)\,dt\Bigr\} \varphi(s)\,ds$$

exists for every $\varphi$ in $L^{p}$ and $\psi$ in $L^{q'}$. For $p' = \infty$ the restriction

$$K_F = \text{l.u.b.}_{\|\psi\|=1} \Bigl(\text{ess. sup.}_s \Bigl|\int_0^1 F(s, t)\psi(t)\,dt\Bigr|\Bigr) < \infty$$

is imposed.

LEMMA 7.21. If $x(s)$ from $[0, 1]$ to $L^{q}$, $(1 \le q \le \infty)$, is weakly measurable,
then there exists a function $F(s, t)$ from the unit square to the reals that is
measurable in $(s, t)$ and such that $x(s) = F(s, \cdot)$ as elements in $L^{q}$ for each $s$ in $[0, 1]$.
Any two such "measurable representations" of $x(s)$ differ on at most a set of
plane measure zero.

For $q < \infty$ this is an immediate consequence of Corollary 1.11 and a
theorem of Dunford ([9], Theorem 3.1). The case $q = \infty$ results from the
preceding, since $x(s)$, if weakly measurable to $L^{\infty}$, is also weakly measurable when
considered as defined to $L^{1}$.


THEOREM 7.2. If $x(\cdot)$ is in $\mathfrak{L}^{p'}(L^{q})$, then any measurable representation
$F(s, t)$ of $x(s)$ is in $L^{p',q}$. If $q < \infty$ and $F(s, t)$ is in $L^{p',q}$, then the abstract function
$x(s) = F(s, \cdot)$ is in $\mathfrak{L}^{p'}(L^{q}) = \mathfrak{L}_0^{p'}(L^{q})$.

Since any $\psi$ in $L^{q'}$ defines a functional $f_\psi$ over $L^{q}$, $x(\cdot)$ in $\mathfrak{L}^{p'}(L^{q})$ implies
that

$$f_\psi(x(s)) = \int_0^1 F(s, t)\psi(t)\,dt$$

is in $L^{p'}$ and hence that (c) of Definition 7.1 is satisfied; since $F(s, t)$ is a
measurable representation of $x(s)$, the remaining two conditions are also
fulfilled.

Conversely, if $F(s, t)$ is in $L^{p',q}$, then $x(s) = F(s, \cdot)$ is defined from $[0, 1]$
to $L^{q}$; and if $q < \infty$, so that every linear functional $f(y)$ over $L^{q}$ is generated
by an element $\psi$ of $L^{q'}$, it follows that $f(x(s)) = \int_0^1 F(s, t)\psi(t)\,dt$ is in $L^{p'}$; for if
the iterated integral in (c) of Definition 7.1 exists for all $\varphi$ in $L^{p}$, then the
right-hand side of the above equation is in $L^{p'}$ ([1], p. 85). Thus $x(\cdot)$ is in
$\mathfrak{L}^{p'}(L^{q}) = \mathfrak{L}_0^{p'}(L^{q})$.

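THEOREM 7.3. If $F(s, t)$ is a measurable representation of an element $x(\cdot)$
of $\mathfrak{L}_0^{p'}(L^{q})$, then a necessary and sufficient condition that $V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$
from $L^{p}$ to $L^{q}$ be expressible as

$$(2)\qquad V(\varphi) = \int_0^1 F(s, t)\varphi(s)\,ds$$

is that $\varphi$ in $L^{p}$ and $\psi$ in $L^{q'}$ imply that

$$(3)\qquad \int_0^1 \psi(t)\Bigl\{\int_0^1 F(s, t)\varphi(s)\,ds\Bigr\}dt = \int_0^1 \varphi(s)\Bigl\{\int_0^1 F(s, t)\psi(t)\,dt\Bigr\}ds.$$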


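From Theorem 7.2, $F(s, t)$ is in $L^{p',q}$, so that the right-hand integral in
(3) exists. The elements $\psi$ of $L^{q'}$ define linear functionals $f_\psi$ over $L^{q}$; hence if
(2) is true, then

$$\int_0^1 \psi(t)\Bigl\{\int_0^1 F(s, t)\varphi(s)\,ds\Bigr\}dt = f_\psi\Bigl(\int_0^1 x(s)\varphi(s)\,ds\Bigr) = \int_0^1 \varphi(s)\cdot f_\psi(F(s, \cdot))\,ds = \int_0^1 \varphi(s)\Bigl\{\int_0^1 F(s, t)\psi(t)\,dt\Bigr\}ds.$$

On the other hand suppose that $x(\cdot)$ is in $\mathfrak{L}_0^{p'}(L^{q})$, so that $x(\cdot)$ defines
the operation $V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$, and suppose that $x(s)$ has a measurable
representation $F(s, t)$ satisfying (3). By Theorem 3.4 the operation $\bar{V}$ adjoint
to $V$ assigns to each $f$ in $\overline{L^{q}}$ the linear functional over $L^{p}$ generated by the
element $f(x(\cdot))$ of $L^{p'}$.

Let $\varphi$ be any element of $L^{p}$, and let $\mathfrak{F}$ be the set of functionals $f_\psi$ over $L^{q}$
generated by elements $\psi$ of $L^{q'}$. Then for all $f' = \bar{V}(f_\psi)$ with $f_\psi$ in $\mathfrak{F}$

$$f'(\varphi) = \int_0^1 \varphi(s) f_\psi(x(s))\,ds,$$

or

$$f'(\varphi) = \int_0^1 \varphi(s)\Bigl\{\int_0^1 F(s, t)\psi(t)\,dt\Bigr\}ds,$$

or, if (3) is applied,

$$f'(\varphi) = \int_0^1 \psi(t)\Bigl\{\int_0^1 F(s, t)\varphi(s)\,ds\Bigr\}dt,$$

where $\int_0^1 F(s, t)\varphi(s)\,ds$ must be an element of $L^{q}$. Then for a fixed $\varphi$ in $L^{p}$,

$$f'(\varphi) = f_\psi\Bigl(\int_0^1 F(s, \cdot)\varphi(s)\,ds\Bigr)$$

whenever $f_\psi$ is in $\mathfrak{F}$ and $f' = \bar{V}(f_\psi)$. But by definition of $\bar{V}$ the equation
$f'(\varphi) = f_\psi(V(\varphi))$ holds for all $\varphi$ in $L^{p}$; hence, for any fixed $\varphi$,

$$f_\psi(V(\varphi)) = f_\psi\Bigl(\int_0^1 F(s, \cdot)\varphi(s)\,ds\Bigr)$$

for all $f_\psi$ in $\mathfrak{F}$. This implies (2), since the set $\mathfrak{F}$ is a total set of functionals
over $L^{q}$.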

COROLLARY 7.31. If $x(\cdot)$ in $\mathfrak{L}_0^{p'}(L^{q})$ has a measurable representation
$F(s, t)$ such that $\int_0^1 \{\int_0^1 |F(s, t)\varphi(s)\psi(t)|\,dt\}ds$ exists for every $\varphi$ in $L^{p}$ and
$\psi$ in $L^{q'}$, then

$$V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds = \int_0^1 F(s, t)\varphi(s)\,ds$$

for all $\varphi$ in $L^{p}$.

This follows from Theorem 7.3 and a well known theorem of Tonelli
([2], p. 75).

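COROLLARY 7.32. If $F(s, t)$ is in $L^{\infty,q}$, $(q < \infty)$, then

$$\text{ess. sup.}_s \Bigl(\int_0^1 |F(s, t)|^{q}\,dt\Bigr)^{1/q} = K_F < \infty,$$

and

$$V(\varphi) = \int_0^1 F(s, t)\varphi(s)\,ds$$

is defined and linear from $L^{1}$ to $L^{q}$.

By Theorem 7.2, $x(s) = F(s, \cdot)$ is in $\mathfrak{L}^{\infty}(L^{q}) = S_\infty(L^{q})$; hence
(Definition 3.1)

$$\infty > K_F = \text{ess. sup.}_s\, \|x(s)\| = \text{ess. sup.}_s \Bigl(\int_0^1 |F(s, t)|^{q}\,dt\Bigr)^{1/q}.$$

Clearly $x(\cdot)$ satisfies the hypotheses of Corollary 7.31; hence the operation
$V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$ can be written as $V(\varphi) = \int_0^1 F(s, t)\varphi(s)\,ds$.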

THEOREM 7.4. If $q < \infty$, if $F(s, t)$, defined and measurable from the unit
square to the reals, has $F(s, \cdot)$ in $L^{q}$ for almost all $s$, and if

$$U(\psi) = \int_0^1 F(s, t)\psi(t)\,dt$$

is defined from $L^{q'}$ to $L^{1}$, then $U$ is linear and c.c.

If in addition $V(\varphi) = \int_0^1 F(s, t)\varphi(s)\,ds$* is defined from $L^{\infty}$ to $L^{q}$ and
equation (3) of Theorem 7.3 holds for every pair $\psi$ in $L^{q'}$, $\varphi$ in $L^{\infty}$, then $V$ is also
linear and c.c.

This results from Definition 7.1, Theorem 7.2, Corollary 5.32, Theorem
7.3, and Theorem 6.2.

8. Concerning differentiation. In this section $S$ is taken to be euclidean,
and $\alpha(E)$ is the Lebesgue measure function.


* See [9], Theorem 6.1, where for a more restricted type of $F(s, t)$ this operation is shown to be
defined and c.c. from $L^{p}$ to $L^{q}$, $(1 < p \le \infty,\ 1 \le q < \infty)$.


The relation between the Bochner integral and differentiation is that an
additive a.c. BV function of figures is a.e. strongly differentiable if and only
if it is a Bochner integral; and the derivative, when it exists a.e., is (Bn)
integrable to the original function ([18], p. 410, footnote). However ([6],
example 1), an ($\mathfrak{X}$) integral may be nowhere weakly differentiable (paragraph
1.14); there is therefore no hope for a theorem similar to the above and dealing
with the ($\mathfrak{X}$) integral and either strong or weak derivatives. But if another
and more general definition of "derivative" is employed, the corresponding
theorem can be stated.

A function $X(R)$ defined to $\mathfrak{X}$ from the figures in a rectangle $R_0$ of
euclidean $n$-space will be said to be pseudo-differentiable or to have a
pseudo-derivative if there exists an $x(s)$ from $R_0$ to $\mathfrak{X}$ such that for every $f$ in $\bar{\mathfrak{X}}$ the
figure function $f(X(R))$ is differentiable a.e. to the value $f(x(s))$. The function
$x(s)$ is a pseudo-derivative of $X(R)$.

A necessary and sufficient condition that an additive a.c. function $X(E)$ of
measurable sets in $R_0$ be pseudo-differentiable is that the function $X(E)$ be an
($\mathfrak{X}$) integral.

The sufficiency is obvious; an ($\mathfrak{X}$) integral has the integrand as a
pseudo-derivative.

If $X(E)$ has a pseudo-derivative $x(s)$, then if $X(E)$ is additive and a.c.,
so is $f(X(E))$; hence

$$f(X(E)) = \int_E f(x(s))\,ds,$$

so that $x(s)$ is integrable to the value $X(E)$.

If $\mathfrak{X}$ is weakly complete, the preceding theorem is true for additive a.c.
functions of figures. For if $X(R)$ has $x(s)$ for a pseudo-derivative, we have
$f(X(R)) = \int_R f(x(s))\,ds$; if $Q$ is any open set, then $Q = \sum I_n$, where the $\{I_n\}$ are
disjoint intervals, and thus

$$\sum_n |f(X(I_n))| = \sum_n \Bigl|\int_{I_n} f(x(s))\,ds\Bigr| \le \int_Q |f(x(s))|\,ds < \infty.$$

Since $\mathfrak{X}$ is weakly complete, it follows that $\sum X(I_n)$ has property (O) and
therefore that $\sum X(I_n)$ is convergent. Designating this sum by $X(Q)$ we have

$$f(X(Q)) = \sum_n f(X(I_n)) = \int_Q f(x(s))\,ds.$$

Finally, if $E$ is an arbitrary measurable set, and if $\{Q_n\}$ are open sets
containing $E$ with $\alpha(Q_n) \to \alpha(E)$, then

$$f(X(Q_n)) = \int_{Q_n} f(x(s))\,ds \to \int_E f(x(s))\,ds;$$

and, again using the weak completeness of $\mathfrak{X}$, we find that $\{X(Q_n)\}$ converges
weakly to an element $X(E)$, and $f(X(E)) = \int_E f(x(s))\,ds$. Hence $x(s)$ is ($\mathfrak{X}$)
integrable to the set-function $X(E)$.

The pseudo-derivative suggests a corresponding "descriptive" definition
of an integral: a function $x(s)$ is "integrable" if and only if there exists an
additive a.c. function $X(E)$ of measurable sets which has $x(s)$ for a
pseudo-derivative; the integral of $x(s)$ over $E$ is $X(E)$. From the above theorem this
descriptive integral is the ($\mathfrak{X}$) integral.

If $X(E)$ has a.e. a weak derivative $x(s)$, then $x(s)$ is a pseudo-derivative;
hence if $X(E)$ is additive a.c. and weakly differentiable a.e., this weak
derivative is ($\mathfrak{X}$) integrable to the value $X(E)$; moreover, by Theorem 1.2 the
derivative is measurable, so that $X(E)$ is the ($\mathfrak{X}$) integral of a measurable
integrable function, its weak derivative. We have been unable to resolve the
converse question: does a measurable integrable function have its integral
weakly differentiable a.e.? If so, then by Theorem 5.2 this weak derivative
coincides a.e. with the original function.

In conclusion we recall that if $\mathfrak{X}$ is separable and $x(s)$ is essentially
bounded and weakly measurable, then $x(s)$ is (Bn) integrable, and $\int_R x(s)\,ds$
$=$ (Bn) $\int_R x(s)\,ds$ is therefore strongly differentiable a.e. to $x(s)$. In particular
if $\mathfrak{X}$ is separable and $x(s)$ is (Bk) integrable and essentially bounded, then
(Bk) $\int_R x(s)\,ds = \int_R x(s)\,ds$ is strongly differentiable a.e. to $x(s)$. This gives an
affirmative answer to a question raised in [6] (p. 378).

9. Examples. We give the following examples: 

9.1. Let $\mathfrak{X}$ be the space of real functions $\varphi(t)$ bounded on $[0, 1]$, with
$\|\varphi\| = \text{l.u.b.}_t\, |\varphi(t)|$. Let $E_0$ be a non-measurable set in $S = [0, 1]$; define $x(s) = \theta$
for $s$ in $S - E_0$, and set $x(s)$ equal to the characteristic function of the point $s$
for $s$ in $E_0$. Then $x(s)$ is Graves integrable (see [3]) over every measurable set
$E$ to the value $\theta$; hence $x(s)$ is (Bk) integrable and integrable to the same
value. The function $x(s)$ is weakly measurable and integrable, yet $\|x(s)\|$ is
not measurable. And the set function $X(E) = \int_E x(s)\,ds = \theta$ is weakly
differentiable everywhere to the value $\theta$ and thus has two pseudo-derivatives, $x(s)$
and the function identically $\theta$, that differ on a set of positive measure.

9.2. Let $S = [0, 1]$, let $\alpha(E) = mE$, where $m$ is the Lebesgue measure
function, and let $\mathfrak{X} = c$, the separable space of convergent sequences of reals. On
$J_n = (1/2^{n}, 1/2^{n-1}]$ define $x(s) = 2^{n}x_n$, where $x_n$ is the $n$th unit vector in $c$.

The function $x(s)$ is in $\mathfrak{L}^{1}(c)$, that is, $x(s)$ is (D2) integrable, and if
$E_n = E \cdot J_n$, then

$$\text{(D2)} \int_E x(s)\,ds = F_E = \{2^{n}\alpha(E_n)\},$$

an element of $\bar{\bar{c}} = l^{\infty}$, where $l^{\infty}$ is the space of bounded sequences. Here $F_E$ is
not a.c., so that, from Theorem 5.3, $x(\cdot)$ is not in $\mathfrak{L}_0^{1}(c)$;* hence in Corollary
5.31 the hypothesis of weak completeness cannot be omitted. Nor is the set
function $F_E$ c.a., since the series $\sum_n F_{J_n}$ in $\bar{\bar{c}}$ is not even weakly convergent
to an element; to prove this it is sufficient to consider the linear functional
over $l^{\infty} = \bar{\bar{c}}$ defined by Hildebrandt's generalized Stieltjes integral ([13], p.
870) using Banach's measure function ([1], p. 231).

If $y(s) = x(s)$ on $(0, 1]$, if $y(s) = -x(-s)$ for $s$ in $[-1, 0)$, and if $y(0) = \theta$,
then $y(s)$ is integrable over $[-1, 1]$ in the sense of Definition 2.1, yet $y(s)$
is not ($\mathfrak{X}$) integrable; this is in contrast to the (Bk) definition, since (Bk)
integrability over $S$ implies (Bk) integrability over every measurable subset of
$S$ ([6], Theorem 14).
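
The claims of Example 9.2 can be checked coordinate by coordinate. The LaTeX sketch below does so under one labeled convention: $a_n$ denotes the value $f(x_n)$ of a linear functional $f$ over $c$ on the $n$th unit vector, a notation introduced here for illustration, with $\sum_n |a_n| < \infty$ as for any linear functional over $c$.

```latex
% Sketch for Example 9.2. Assumed notation: a_n := f(x_n) for a linear
% functional f over c, so that \sum_n |a_n| < \infty; \alpha(J_n) = 2^{-n}.
\[
  f(x(s)) = 2^{n} a_n \quad (s \in J_n), \qquad
  \int_0^1 |f(x(s))|\,ds
  = \sum_{n=1}^{\infty} 2^{n}|a_n|\,\alpha(J_n)
  = \sum_{n=1}^{\infty} |a_n| < \infty ,
\]
% so x(.) is in \mathfrak{L}^1(c), and the (D2) integral acts through
\[
  F_E(f) = \int_E f(x(s))\,ds
  = \sum_{n=1}^{\infty} a_n\,2^{n}\alpha(E\cdot J_n),
  \qquad 0 \le 2^{n}\alpha(E\cdot J_n) \le 1 .
\]
% Failure of absolute continuity: the sets J_n shrink but F_{J_n} does not.
\[
  \alpha(J_n) = 2^{-n} \to 0
  \quad\text{while}\quad
  \|F_{J_n}\| = \sup_{\|f\|\le 1} |f(x_n)| = \|x_n\| = 1 .
\]
```

The last line is the concrete reason $F_E$ fails to be absolutely continuous: on $J_n$ the set function is the $n$th unit vector, of norm 1, however small $\alpha(J_n)$ becomes.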

9.3. Let $\{x_n\}$ be a complete orthonormal sequence in $L^{2}$, and define
$x(s) = 2^{n}x_n$ on $J_n = (1/2^{n}, 1/2^{n} + 1/2^{2n}]$ and $x(s) = \theta$ elsewhere in $[0, 1]$. Then
$x(s)$ is (Bn) integrable, and $x(\cdot)$ is in $\mathfrak{L}_0^{2}(L^{2})$, while $\int_0^1 \|x(s)\|^{2}\,ds = \infty$.
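
The three assertions of Example 9.3 reduce to elementary series. The LaTeX sketch below records them under stated assumptions: the index $n$ runs over $1, 2, \cdots$, the space $L^{2}$ is real, and $f$ is the functional generated by $\sum_n a_n x_n$ with $\sum_n a_n^{2} = \|f\|^{2}$; the coefficients $a_n$ are notation introduced here, not in the original.

```latex
% Sketch for Example 9.3. Assumptions: n = 1, 2, ...; \alpha(J_n) = 2^{-2n};
% \|x(s)\| = 2^{n} on J_n; f is generated by \sum_n a_n x_n, \sum_n a_n^2 = \|f\|^2.
\[
  \int_0^1 \|x(s)\|\,ds
  = \sum_{n=1}^{\infty} 2^{n}\cdot 2^{-2n}
  = \sum_{n=1}^{\infty} 2^{-n} = 1 < \infty
  \qquad\text{(Bochner integrability)},
\]
\[
  \int_0^1 \|x(s)\|^{2}\,ds
  = \sum_{n=1}^{\infty} 2^{2n}\cdot 2^{-2n}
  = \sum_{n=1}^{\infty} 1 = \infty ,
\]
\[
  \int_0^1 |f(x(s))|^{2}\,ds
  = \sum_{n=1}^{\infty} (2^{n}a_n)^{2}\,2^{-2n}
  = \sum_{n=1}^{\infty} a_n^{2} = \|f\|^{2},
  \qquad\text{so that } ((x))_2 = 1 .
\]
```

The contrast between the second and third lines is the point of the example: the weak norm $((x))_2$ is finite although $\|x(s)\|^{2}$ is not summable.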

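This example shows that in Theorem 6.2 we cannot replace $\mathfrak{L}_0^{1}(\mathfrak{X})$ by
$\mathfrak{L}_0^{p'}(\mathfrak{X})$, $(1 < p' < \infty)$. For in $\mathfrak{L}_0^{2}(L^{2})$ there are no step-function elements within
$1/2$ of $x(\cdot)$, so that $V(\varphi) = \int_0^1 x(s)\varphi(s)\,ds$ from $L^{2}$ to $L^{2}$ is not c.c. Suppose that
$0 < \varepsilon < 1/2$ and that $y(s)$ is a step-function taking the values $y_1, \cdots, y_k$. If $f_n$
is the linear functional over $L^{2}$ generated by $x_n$, then $\sum_n |f_n(y_i)|^{2} < \infty$ for
each $i$; hence there exists an $N$ such that

$$|f_N(y_i)| < \varepsilon, \qquad i = 1, \cdots, k.$$

If $s$ is in $J_N$, then

$$|f_N(x(s) - y(s))| = |f_N(2^{N}x_N - y(s))| > 2^{N}\Bigl(1 - \frac{\varepsilon}{2^{N}}\Bigr);$$

whence

$$((x - y))_2 \ge \Bigl(\int_{J_N} |f_N(x - y)|^{2}\,ds\Bigr)^{1/2} \ge \Bigl(2^{2N}\Bigl(1 - \frac{\varepsilon}{2^{N}}\Bigr)^{2}\frac{1}{2^{2N}}\Bigr)^{1/2} = 1 - \frac{\varepsilon}{2^{N}} > \varepsilon.$$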


* This example is similar to one due to Birkhoff, cited indirectly on page 378 of [6]. 
Thus although every element of the class $S_{p',q}$ ([9], p. 474), composed of
functions $x(s)$ measurable to $L^{q}$ and having $\|x(s)\|^{p'}$ summable, has its $V(\varphi)$
c.c., this is no longer true if $S_{p',q}$ is replaced by $\mathfrak{L}_0^{p'}(L^{q})$.

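9.4. Let $\{x_{ij}\}$ be a complete orthonormal sequence in $L^{2}$ arranged in a
doubly infinite array, and consider $y_i(s) = x_{ij}$ on $E_{ij} = [(j-1)/2^{i}, j/2^{i})$,
$(j = 1, 2, \cdots, 2^{i};\ i = 1, 2, \cdots)$. Let $x_n = \sum_{i=1}^{n} y_i$. Since $y_i(\cdot)$ is in
$\mathfrak{L}_0^{p}(L^{2})$ and $((y_i))_p \le 2^{-i/2}$, we have

$$((x_n - x_m))_p \le \sum_{i=m+1}^{n} ((y_i))_p \to 0$$

as $m, n \to \infty$, so that $\{x_n\}$ is a Cauchy sequence in $\mathfrak{L}_0^{p}(L^{2})$, $(1 \le p \le 2)$. If
there were an $x(\cdot)$ in $\mathfrak{L}_0^{p}(L^{2})$ such that $((x_n - x))_p \to 0$, then (paragraph 3.22)
it follows that $((x_n - x))_1 \to 0$ and $\text{l.u.b.}_E \|\int_E (x_n - x)\,ds\| \to 0$ with $1/n$. But as
Birkhoff has observed (this example is example 5 of [6]), there exists no $x(s)$
satisfying this last condition. Hence the space $\mathfrak{L}^{p}(L^{2}) = \mathfrak{L}_0^{p}(L^{2})$ is not
complete for $1 \le p \le 2$.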

If $V_n$ is the operation from $L^{p'}$ to $L^{2}$, $(2 \le p' < \infty)$, defined by the
step-function $x_n$, then $V_n$ is c.c., and $V = \lim V_n$ is therefore also c.c. But $V$ is
generated by no function integrable either (Bn), (Bk), ($\mathfrak{X}$), (D1), or (D2), since
the last four integrals have equivalent definitions when $\mathfrak{X} = L^{2}$ and each of
these definitions is more general than that of Bochner.

This example also shows that for step-functions, $((x))_p$ is not topologically
equivalent to $(\int_0^1 \|x(s)\|^{p}\,ds)^{1/p}$. Moreover, if $X(E) = \lim_n \int_E x_n(s)\,ds$, then $X(E)$
is the limit of $X_n(E) = \int_E x_n(s)\,ds$ uniformly with respect to $E$, and hence is a
c.a. and a.c. function from measurable sets to $L^{2}$; yet $X(E)$ has a weak
derivative on at most a set of measure zero, and from a remark in §8, has no
pseudo-derivative. Finally, none of the integral definitions that have been considered
here serve to define the general linear or general completely continuous
operation from $L^{p}$ to $L^{2}$.

Conclusion. The following questions are among those we have failed to 
answer. 

Is the integral of a measurable integrable function weakly differentiable 
a.e.? 

Can the hypothesis of measurability be dropped in Theorem 5.3? In Theo- 
rem 6.2? 

Is the ($\mathfrak{X}$) integral equivalent to that of Birkhoff?

Are step-functions dense in $\mathfrak{L}_0^{1}(\mathfrak{X})$ for non-separable $\mathfrak{X}$? If so, then every
integrable function defines c.c. operations from $\bar{\mathfrak{X}}$ to $L^{1}$ and from $L^{\infty}$ to $\mathfrak{X}$.
If the ($\mathfrak{X}$) integral is equivalent to the (Bk) integral, the answer to this
question is yes, by paragraph 3.22 above and Theorem 18 of [6].


UNIFORMITY IN LINEAR SPACES*


BY 
NELSON DUNFORD 


INTRODUCTION 


The first chapter of this paper concerns itself with questions of uniform
boundedness of sets of points in a Banach space and sets of functionals
on a Banach space, as well as with a group of closely related resonance
theorems. A well known example coming under this heading is the theorem of
Toeplitz [36]† stating that $\sup_m \sum_n |a_{mn}| < \infty$ providing $\sum_n a_{mn}\xi_n$
converges whenever $\xi_n$ does. Another is the theorem of Hahn [12] stating that
if an arbitrary continuous function has the partial sums of its Fourier
expansion, with respect to an orthonormal sequence of bounded functions $\omega_\nu$,
essentially bounded, then the sequence $\int |\sum_{\nu=1}^{n} \omega_\nu(x)\omega_\nu(t)|\,dt$ is also essentially
bounded. Still another is the theorem stating that if the adjoint of an
everywhere defined transformation between Banach spaces is everywhere defined,
then the transformation is continuous. This was proved, at least for Hilbert
space, by von Neumann [21], Stone [34], Tamarkin ([34], p. iv), and Stone
and Tamarkin [35] and is probably not usually thought of as a theorem on
uniform boundedness.

Most of the results we have in the first chapter are new, others have been 
proved only in special cases, and some are well known but the proofs hereto- 
fore given have been different. Previous methods for discussing questions of 
uniform boundedness divide themselves into three groups (i) those associated 
with the names of Lebesgue [18], Banach [2], Hahn [11], and Hildebrandt 
[13]; (ii) those characterized by the elegant and direct use of the Baire cate- 
gory theorem as in the works of Banach [1], Saks [31], Saks and Tamarkin 
[32], and others; and (iii) those employed in a recent theorem of Gelfand [9] 
the proof of which is closely related to that of the category theorem. 

The second chapter of this paper is concerned with cases where a weak 
limiting process implies a strong one. These cases seem to be rather rare but 
probably more will come to light in the future. The theorem quoted above, 
stating that the existence of the adjoint implies the continuity of the function, 
belongs to the class of questions discussed in Chapter II as well as those dis- 
cussed in Chapter I, while its analogue (Theorem 42) in the Boolean ring of 

* Presented to the Society, December 31, 1936, under the title, Integration of
vector-valued functions; received by the editors September 2, 1937.
† Numbers in brackets refer to the bibliography at the end of the paper.

Lebesgue measurable subsets of (0, 1), which states that an additive vector
valued set function $y(e)$ is absolutely continuous providing $\gamma y(e)$ is
absolutely continuous for every linear functional $\gamma$, is more
characteristic of the phenomenon discussed in the second chapter. This is
because the continuity of an
additive function on a Boolean ring is not a consequence of its boundedness as 
is the case with linear operations on a Banach space. The theorem on the 
ring is closely related to the theorem of Orlicz [23] which asserts that for 
weakly complete Banach spaces the weak unconditional convergence of a 
series implies its unconditional convergence in the strong sense. Another
result in this category which has appeared recently is the theorem of Pitt [26]
which states that a bilinear form $\sum_{i,j} \xi_i a_{ij} \eta_j$, if bounded for

$\|x\|_p = (\sum_i |\xi_i|^p)^{1/p} = 1 = \|y\|_q = (\sum_j |\eta_j|^q)^{1/q}$,

where $1/p+1/q<1$, is convergent in the sense of Pringsheim as a double
series and uniformly for $\|x\|_p=\|y\|_q=1$. This result is not true for
$1/p+1/q \ge 1$. The theorem may be worded in terms of linear operations
on $l_q$ and would state that every continuous linear operator on $l_q$ to
$l_{p'}$ is completely continuous if $q > p' \ge 1$. In this form it is closely
related to Theorem 71
which has also been proved by Pettis [24] and states that an operator on 
L” to L, (p>1), which has the form 


vo) = 


where H(s, ¢) is in L”’ for each s, is necessarily completely continuous. This 
shows that the expansion of y with respect to any complete orthonormal se- 
quence in L is necessarily convergent (in L) uniformly with respect to 
| edt=1. 

With the exception of Pitt’s result in the case where p’>1, Chapter II 
contains results considerably broader than those outlined above. Another ex- 
ample of the above type, which falls, however, more naturally into the first 
chapter is the result that the Riemann integral $\int \phi(t)\,df(t)$, where
$f(t)$ has its values in a Banach space, exists for every continuous function
$\phi$ providing the integral $\int \phi(t)\,d\gamma f(t)$ exists for every
continuous $\phi$ and every linear functional $\gamma$. Still another is that
(Theorem 76) a function $f(z)$ from a domain in the complex $z$-plane to a
complex Banach space is analytic providing $\gamma f(z)$ is analytic for every
complex valued linear functional $\gamma$.

In Chapter III the results of the preceding chapters are applied to a 
theory of Lebesgue integration which is broader than those heretofore given
and which furnishes the natural tool for the solution of a general type of
moment problem (Theorem 61). The central idea underlying the integral is
that a function $f(p)$ (on a measurable set $E$ to a Banach space $Y$) which has
the property that $\gamma f(p)$ is summable for every $\gamma$ in a closed
linear manifold $\Gamma \subset \bar{Y}$, defines uniquely a point in
$\bar{\Gamma}$ according to the equation


= f 


This notion is intimately related to the recent work of Pettis whose manu- 
script I had the privilege of seeing shortly before I finished the typing of my 
own. Pettis’ paper gives the reader a very interesting discussion of the case 
where '=¥Y and ¥, is in Y for every measurable subset e of E, and this case 
is undoubtedly one of the most important to be considered. Another case of 
interest which we have discussed only slightly is the case where Y itself is a 
conjugate space Z, and '=Z. Here one always has ¥, in Y. The integral in 
this case has been defined by Gelfand [9] but has not, as far as we know, been 
discussed in any detail. 

In Chapter IV a few instances of the general theorems are pointed out. 

Notation. The notation we have used is for the most part self explanatory 
or else explained where it is introduced, but it might help the reader to keep 
in mind that throughout the paper Y and Z are arbitrary Banach spaces 
while $X$ is a Banach space subjected to three restrictions imposed at the
beginning of Chapter I and later in Chapter II to a fourth restriction. The
symbol $\Gamma$ is always used for a closed linear manifold in $\bar{Y}$, and
$\gamma$ for a point in $\Gamma$. Thus in expressions such as
$\sup_{\|\gamma\|=1} \gamma y$ it is to be understood, unless explicitly stated
to the contrary, that $\gamma$ is restricted to be in $\Gamma$.

We will be dealing with functions $f(t)$ on a class $T$ to a Banach space $Y$.
In connection with these functions the following symbols occur:


$f(t)$, $f$, $\gamma f(.)$, $\nu f$.


The first quite naturally means the value of the function for the argument $t$.
The second is used when we think of $f$ as a point in an abstract space. The
third, following the notation of E. H. Moore, we are to interpret as follows:
It may be that for a given $\gamma$ in $\Gamma$ the numerical function
$\gamma f(t)$ on $T$, when considered as a single entity, is an element of a
Banach space $X$ whose elements are numerical functions on $T$. If so, then
$\gamma f(.)$ represents the point in the Banach space $X$. In the last one
$\nu$ is a linear functional on a Banach space $X$ whose elements are numerical
functions on $T$. In Chapter III we have shown that the domain of a linear
functional $\nu$ on $X$ can be extended in a natural way to the class
$\mathfrak{X}[Y, \Gamma]$ of all $f(t)$ on $T$ to $Y$ such that $\gamma f(.)$ is
in $X$ for every $\gamma$ in $\Gamma$. The symbol $\nu f$ is then the value of
the function $\nu$ for the argument $f$ in $\mathfrak{X}[Y, \Gamma]$. This
notation might be confusing in case $\mathfrak{X}[Y, \Gamma]$ were a Banach
space (which it sometimes is), and we wished to express the value of a linear
functional on $\mathfrak{X}[Y, \Gamma]$ for a particular argument in
$\mathfrak{X}[Y, \Gamma]$. But since we shall not have occasion to do this, no
confusion should arise. It should be noted that if $X = Y$, then both symbols
$\gamma f(.)$ and $\gamma f$ may have a meaning, but the meaning is in general
different.


CHAPTER I 


1.0. Uniform boundedness. Let $X$ be a Banach space composed of numerical
functions $\phi(t)$, where $t$ ranges over an abstract set $T$. Throughout
what follows $X$ will be subject to the following conditions:

(1) If $\phi(t)=\phi_1(t)+\phi_2(t)$ for $t$ in $T$, then $\phi=\phi_1+\phi_2$.

(2) If, for a numerical constant $c$, $c\phi_1(t)=\phi(t)$ for $t$ in $T$, then $c\phi_1=\phi$.

(3) If $\phi_n \to \phi$ and $\phi_n(t) \to \phi_*(t)$ for $t$ in $T$, then $\phi=\phi_*$.

Let $Y$ be an arbitrary Banach space and $\Gamma$ a closed linear manifold in
$\bar{Y}$, the space conjugate to $Y$. The linear space
$\mathfrak{X}=\mathfrak{X}[Y, \Gamma]$ is, by definition, the space of all
functions $f=f(t)$ on $T$ to $Y$ such that $\gamma f(.)$ is in $X$ for every
$\gamma$ in $\Gamma$.

THEOREM 1. If $f$ is in $\mathfrak{X}[Y, \Gamma]$, then $\gamma f(.)$ is a
continuous linear operation on $\Gamma$ to $X$. In other words, there exists a
smallest non-negative number $\|f\|$ such that

$\|\gamma f(.)\| \le \|f\|\,\|\gamma\|$, $\gamma \in \Gamma$.

By (1) and (2) we see that the operation $U(\gamma) =\gamma f(.)$ on $\Gamma$
to $X$ is additive, that is,

$U(c_1\gamma_1 + c_2\gamma_2) = c_1U(\gamma_1) + c_2U(\gamma_2)$,

for every pair $c_1$, $c_2$ of numerical constants and every pair $\gamma_1$,
$\gamma_2$ of points in $\Gamma$. Condition (3) shows that if
$\gamma_n \to \gamma^*$ and $U(\gamma_n) \to \phi$, then $U(\gamma^*) =\phi$;
thus by a well known theorem ([1], p. 41, Theorem 7), $U$ is a continuous
operation from $\Gamma$ to $X$.


THEOREM 2. Let $f(t)$ be a function on an arbitrary set $T$ to a Banach space
$Y$. If

$\sup_{t \in T} |\gamma f(t)| < \infty$, $\gamma \in \bar{Y}$,

then

$\sup_{t \in T} \|f(t)\| < \infty$.

Since $\sup \gamma y =\|y\|$ ([1], p. 55, Theorem 3) where the sup is taken over
all $\gamma$ in $\bar{Y}$ for which $\|\gamma\| =1$, this theorem follows from
Theorem 1 by taking $\Gamma = \bar{Y}$ and $X = M^*(T)$ = the space of
functions bounded on $T$.
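
The reduction of Theorem 2 to Theorem 1 can be written out in two lines. The following is only a sketch: it assumes that the inequality of Theorem 1 reads $\|\gamma f(.)\| \le \|f\|\,\|\gamma\|$, that the norm in $M^*(T)$ is the supremum over $T$, and it writes $\bar{Y}$ for the conjugate space (a notation adopted here for legibility).

```latex
% Sketch, under the assumptions stated in the lead-in.
\begin{align*}
\sup_{t\in T}|\gamma f(t)|
  &= \|\gamma f(.)\|_{M^*(T)} \le \|f\|\,\|\gamma\|,
  && \gamma \in \bar{Y}, \\
\|f(t)\|
  &= \sup_{\|\gamma\|=1}|\gamma f(t)| \le \|f\|,
  && t \in T .
\end{align*}
% Hence \sup_{t \in T} \|f(t)\| \le \|f\| < \infty.
```

The second line uses the norm formula $\sup_{\|\gamma\|=1} \gamma y = \|y\|$ quoted from [1]; the bound obtained is the constant $\|f\|$ of Theorem 1.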


THEOREM 3. Let $Y$ be a Banach space and $f_t$ a function on an arbitrary
range $T$ to $\bar{Y}$, the space conjugate to $Y$. If

$\sup_{t \in T} |f_t y| < \infty$, $y \in Y$,

then

$\sup_{t \in T} \|f_t\| < \infty$.

This follows from Theorem 1 by taking $X = M^*(T)$ and replacing $Y$ by
$\bar{Y}$ and $\Gamma$ by $Y$ which can be considered as a closed linear
manifold in $\bar{\bar{Y}}$ and defined as all $\bar{\bar{y}}$ in
$\bar{\bar{Y}}$ expressible in the form $\bar{\bar{y}}(\bar{y}) =\bar{y}(y)$.

Occasionally in what follows we shall assume that there is a notion of null 
set in T. This notion is subject to the single restriction: 

(N) A denumerable sum of null sets is a null set. 


Thus the notion of null set may be defined in terms of a completely additive
measure function, in terms of first category sets, or in terms of sets
consisting of at most a denumerable number of points, and so on, or may simply
mean a void set. Such terms as ess. sup.$_t\,\phi(t)$ and the Banach space
$M(T)$ of essentially bounded functions on $T$ then have a meaning.


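THEOREM 4. Let $T$ be a set $(t)$ of points in which there is a notion of null
set satisfying condition (N). Let $Y$ be a Banach space for which $\bar{Y}$ is
separable. If $f(t)$ on $T$ to $Y$ is such that $\gamma f(t)$ is essentially
bounded for every $\gamma$ in $\bar{Y}$, then $\|f(t)\|$ is essentially bounded.


In Theorem 1 take $X = M(T)$, $\Gamma=\bar{Y}$; then

ess. sup.$_t\,|\gamma f(t)| \le \|f\|\,\|\gamma\|$.

If $\{\gamma_i\}$ is dense in $\bar{Y}$ and $T_i$ is the set in $T$ where

$|\gamma_i f(t)| > \|f\|\,\|\gamma_i\|$,

then $T_i$, and thus $\sum_i T_i$, is a null set, and

$|\gamma f(t)| \le \|f\|\,\|\gamma\|$

for every $\gamma$ and every $t$ in $T-\sum_i T_i$. Thus $\|f(t)\| \le \|f\|$
on $T-\sum_i T_i$.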
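
The passage from the dense sequence to an arbitrary functional, which the proof of Theorem 4 leaves implicit, might be spelled out as follows. This is a sketch only: it assumes that $\{\gamma_i\}$ is dense in the conjugate space (written $\bar{Y}$ here), that $N$ denotes the union of the exceptional null sets $T_i$, and that $\|f\|$ is the constant furnished by Theorem 1.

```latex
% Sketch; N is the union of the null sets T_i, itself null by condition (N).
\begin{align*}
|\gamma f(t)|
  &\le |\gamma_i f(t)| + |(\gamma-\gamma_i) f(t)| \\
  &\le \|f\|\,\|\gamma_i\| + \|\gamma-\gamma_i\|\,\|f(t)\|,
  \qquad t \in T - N .
\end{align*}
% Letting \gamma_i \to \gamma along the dense sequence (t fixed):
\[
|\gamma f(t)| \le \|f\|\,\|\gamma\|,
\qquad
\|f(t)\| = \sup_{\|\gamma\|=1} |\gamma f(t)| \le \|f\|,
\qquad t \in T - N .
\]
```

Separability is what allows a single null set $N$ to serve for every functional at once.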


THEOREM 5. Let $T$ be a set $(t)$ of points in which there is defined a notion
of null set satisfying condition (N). Let $Y$ be a separable Banach space. If
$f_t$ on $T$ to $\bar{Y}$ is such that $f_t y$ is essentially bounded for every
$y$ in $Y$, then $\|f_t\|$ is essentially bounded.


The proof is entirely analogous to that of Theorem 4.


THEOREM 6. Let $V$ be an arbitrary set $(v)$ of points and $U$ a set $(u)$ in
which there is a notion of null set satisfying condition (N). Let $Y$ be a
Banach space for which $\bar{Y}$ is separable. If $f(u, v)$ on $UV$ to $Y$ is
such that for each $\gamma$ in $\bar{Y}$ there is a null set
$U_\gamma \subset U$ and a constant $M_\gamma$ such that

$|\gamma f(u, v)| < M_\gamma$, $u \in U - U_\gamma$, $v \in V$,

then there is a null set $U_0 \subset U$ and a constant $M$ such that

$\|f(u, v)\| < M$, $u \in U - U_0$, $v \in V$.

In the product space $T= UV$ of points $(u, v)$, null sets may be defined as
sets of the form $U_0V$, where $U_0$ is a null set. Theorem 6 is then a
corollary of Theorem 4.


THEOREM 7. Let $U$ and $V$ be as in Theorem 6 and $Y$ a separable Banach
space. If $f_{u,v}$ on $UV$ to $\bar{Y}$ is such that for each $y$ in $Y$ there
is a null set $U_y \subset U$ and an $M_y$ such that

$|f_{u,v}\,y| \le M_y$, $u \in U - U_y$, $v \in V$,

then there is a null set $U_0 \subset U$ and a constant $M$ such that

$\|f_{u,v}\| \le M$, $u \in U - U_0$, $v \in V$.


This follows from Theorem 5 as Theorem 6 did from Theorem 4.

The postulate of separability in Theorems 4 and 6 is not entirely necessary.
They hold, for example, when $Y = L$, the space of summable functions,
and thus $\bar{Y} = M$, the space of essentially bounded and measurable
functions, which is not separable. They hold even when $Y = M$ and $\bar{Y}$ is
therefore the space of bounded additive set functions ([7], [15]). These facts
we shall prove presently.

Indeed they hold (in a more general form in that $\gamma$ is not arbitrary)
for any space $Y$ for which $\bar{Y}$ is the first or second conjugate of a
separable space. As the reader will readily see from the argument, the only
thing necessary is that $\bar{Y}$ have the following property: Let $Y$ be a
Banach space; then the conjugate space $\bar{Y}$ is said to be a fundamentally
separable space (f.s. space) with determining manifold $\Gamma$ in case
$\Gamma$ is a separable closed linear manifold in $\bar{Y}$ such that for every
$y$ in $Y$ and $\epsilon>0$ there is a $\gamma$ in $\Gamma$ with
$\|\gamma\| =1$ and $\gamma y >\|y\| - \epsilon$; that is,
$\sup_{\|\gamma\|=1} \gamma y =\|y\|$ for each $y$.

THEOREM 8. If $Y$ is a separable Banach space, then $\bar{Y}$ and
$\bar{\bar{Y}}$ are fundamentally separable spaces.


The space $\bar{Y}$ is an f.s. space, for let $y_p$ be dense in $Y$, and let
$\gamma_p$ in $\bar{Y}$ be such that ([1], p. 55, Theorem 3)

$\gamma_p y_p = \|y_p\|$, $\|\gamma_p\| = 1$.

Let $\Gamma$ be the closed linear manifold determined by $\gamma_p$,
$(p=1, 2, \cdots)$. Then if $\|y-y_p\| <\epsilon/2$, we have

$\sup_{\|\gamma\|=1} \gamma y = \sup_{\|\gamma\|=1} [\gamma(y - y_p) + \gamma y_p] > \|y_p\| - \epsilon/2 > \|y\| - \epsilon$.

In the case of $\bar{\bar{Y}}$ we can take $\Gamma=Y$.
A determining manifold for $M$, the space of essentially bounded functions,
is the subspace $C$ of continuous functions. This is a consequence of the
formula

$\int g(t) f'(t)\,dt = \int g(t)\,df(t)$

(where $f(t)$ is an absolutely continuous function, $g(t)$ is a continuous
function, and the integral on the right is the Riemann-Stieltjes integral)
together with the fact that

$\sup_{|g(t)| \le 1} \int g\,df$ = (total variation of $f$) = $\int |f'(t)|\,dt$.


A determining manifold for the space M of additive set functions ¢(£), with 
||¢|| =total variation of ¢, is the subspace consisting of the absolutely con- 
tinuous set functions. This set is isomorphic to L. Similarly, a determining 
manifold for the space BV of functions f, which are of bounded variation and 
are normalized so that the total variation of f is the norm of the linear func- 
tional {gdf on C, is the subspace of absolutely continuous functions. As a final 
example of a non-separable f.s. space we mention the space BV [16]. A de- 
termining manifold is the space C itself, that is, the set of linear functionals 
on BV which are expressible in the form 


vin = f sods, 


where g is continuous and |||] =sup, | g(é)|. 
Theorems 4 and 6 become then, in their more general form, Theorems 9 
and 10 below. 


THEOREM 9. Let $T$ be a set $(t)$ of points in which there is a notion of null
set satisfying condition (N). Let $\bar{Y}$ be an f.s. space with determining
manifold $\Gamma$. If $f(t)$ on $T$ to $Y$ is such that $\gamma f(t)$ is
essentially bounded for every $\gamma$ in $\Gamma$, then $\|f(t)\|$ is
essentially bounded.

THEOREM 10. Let $V$ be an arbitrary set $(v)$ of points and $U$ a set $(u)$ in
which there is a notion of null set satisfying condition (N). Let $\bar{Y}$ be
an f.s. space with determining manifold $\Gamma$. If $f(u, v)$ on $UV$ to $Y$
is such that for each $\gamma$ in $\Gamma$ there is a null set
$U_\gamma \subset U$ and an $M_\gamma$ such that

$|\gamma f(u, v)| \le M_\gamma$, $u \in U - U_\gamma$, $v \in V$,

then there is a null set $U_0 \subset U$ and a constant $M$ such that

$\|f(u, v)\| < M$, $u \in U - U_0$, $v \in V$.
1.1. Functions of bounded variation. Let $\Delta$ be a finite number of
non-overlapping intervals $(a_i, b_i)$ on $(a, b)$. A function $f(p)$ on
$(a, b)$ to the Banach space $Y$ is said to be of bounded variation* on
$(a, b)$ in case

$\sup_\Delta \|\Delta f\| < \infty$,

where $\Delta f =\sum_i (f(b_i) -f(a_i))$. This is equivalent, in the case of
numerical functions, to saying that the sum $\sum_i |f(b_i) -f(a_i)|$ is
bounded in $\Delta$. This notion of bounded variation is too broad for some
purposes. For instance in non-separable spaces a function may be of bounded
variation and nowhere continuous, for example, $f(t)$ on $(0, 1)$ to the space
of bounded functions defined as

$f(t) = K(s, t)$, where $K(s, t) = 1$ for $0 \le s \le t$ and $K(s, t) = 0$ for $t < s \le 1$.
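
Why this example is of bounded variation and yet nowhere continuous can be checked directly. The computation below is a sketch, assuming the kernel is $K(s,t)=1$ for $0 \le s \le t$ and $K(s,t)=0$ for $t<s\le 1$, and that the norm in the space of bounded functions is the supremum over $s$; the indicator notation $\mathbf{1}_A$ is introduced here for convenience.

```latex
% Sketch, assuming K(s,t) = 1 for 0 <= s <= t and 0 for t < s <= 1.
\begin{align*}
f(b_i) - f(a_i) &= K(\cdot, b_i) - K(\cdot, a_i) = \mathbf{1}_{(a_i,\, b_i]}, \\
\Delta f &= \sum_i \mathbf{1}_{(a_i,\, b_i]} = \mathbf{1}_{\bigcup_i (a_i,\, b_i]}
  \quad\text{(the intervals do not overlap)}, \\
\|\Delta f\| &= \sup_s |\Delta f(s)| \le 1
  \quad\text{for every } \Delta, \\
\|f(t) - f(t')\| &= \sup_s \mathbf{1}_{(t',\, t]}(s) = 1
  \quad\text{whenever } t' < t .
\end{align*}
```

The first three lines give bounded variation; the last shows that $f$ has a jump of norm one at every point.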


This cannot happen in separable spaces. But regardless of the space we have 
the following theorem: 


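THEOREM 11. Let $f$ be a function of bounded variation on the interval $(a, b)$
to a Banach space $Y$. Then the Riemann-Stieltjes integral
$\int_a^b \phi(s)\,df(s)$ exists for every continuous function $\phi$.

Let $\pi^1 = (\delta_i^1)$, $\pi^2 =(\delta_j^2)$ be two partitions of $(a, b)$
with norm so small that the oscillation of $\phi$ on any $\delta_i^1$ or
$\delta_j^2$ is less than $\epsilon$. Then if $\tau_i^1 \in \delta_i^1$ and
$\tau_j^2 \in \delta_j^2$,

$\sum_i \phi(\tau_i^1)\,\delta_i^1 f - \sum_j \phi(\tau_j^2)\,\delta_j^2 f = \sum_{i,j} [\phi(\tau_i^1) - \phi(\tau_j^2)]\,\delta_i^1\delta_j^2 f$,

where $\delta_i^1\delta_j^2$ is the ordinary product of the two intervals
$\delta_i^1$, $\delta_j^2$, and $\delta_i^1\delta_j^2 f=0$ in case
$\delta_i^1\delta_j^2$ is empty.

Now to show that this sum is in norm less than or equal to
$2\epsilon \sup_\sigma \|\sum_\sigma \delta f\|$, which is all that is needed,
the following lemma will suffice: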


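* See Gelfand [9] where a corresponding notation is introduced.

LEMMA. Let $a=(a_1, a_2, \cdots, a_n)$ be a variable point in $n$-space with
norm $\|a\| =\sup_i |a_i|$. Let $y_1, \cdots, y_n$ be $n$ fixed points in a
Banach space $Y$. Then the linear operation

$U(a) = \sum_{i=1}^{n} a_i y_i$

on $n$-space to $Y$ has its norm

$\|U\| \le 2 \sup_\sigma \|\sum_{i \in \sigma} y_i\|$,

where $\sigma$ stands for any set of integers between one and $n$.


By a well known theorem ([1], p. 55, Theorem 3)

$\|U\| = \sup \|\gamma U\|$, where $\gamma \in \bar{Y}$, $\|\gamma\| = 1$.

Now $\gamma U(a) = \sum_{i=1}^{n} a_i \gamma y_i$ and

$\|\gamma U\| = \sum_{i=1}^{n} |\gamma y_i| = |\sum_{i \in \sigma_\gamma^+} \gamma y_i| + |\sum_{i \in \sigma_\gamma^-} \gamma y_i| \le 2 \sup_\sigma \|\sum_{i \in \sigma} y_i\|$

for all $\gamma$ with $\|\gamma\| = 1$, where $\sigma_\gamma^+$
($\sigma_\gamma^-$) is the set of integers for which $\gamma y_i \ge 0$ ($<0$).
This completes the proof of the existence of $\int \phi\,df$.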
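
The inequality of the lemma can be checked numerically in a small finite-dimensional case. The sketch below assumes the lemma asserts $\|U\| \le 2 \sup_\sigma \|\sum_{i\in\sigma} y_i\|$ and takes for $Y$ the plane with the Euclidean norm; the function names and the three sample points are hypothetical and chosen only for illustration.

```python
import itertools
import math


def combine(coeffs, points):
    """Return sum_i coeffs[i] * points[i] for points in R^d."""
    dim = len(points[0])
    return [sum(c * p[k] for c, p in zip(coeffs, points)) for k in range(dim)]


def norm(v):
    """Euclidean norm, standing in for the norm of the Banach space Y."""
    return math.sqrt(sum(x * x for x in v))


def operator_norm(points):
    """Norm of U(a) = sum a_i y_i over the cube sup_i |a_i| <= 1.

    The map a -> ||U(a)|| is convex, so its maximum over the cube is
    attained at a vertex, that is, at some choice of signs a_i = +1 or -1.
    """
    n = len(points)
    return max(norm(combine(signs, points))
               for signs in itertools.product((-1.0, 1.0), repeat=n))


def max_subset_norm(points):
    """sup over index sets sigma of || sum_{i in sigma} y_i ||."""
    n = len(points)
    return max(norm(combine(mask, points))
               for mask in itertools.product((0.0, 1.0), repeat=n))


# Hypothetical sample points y_1, y_2, y_3 in the plane.
ys = [(1.0, 0.0), (0.0, 1.0), (-1.0, 1.0)]
lhs = operator_norm(ys)            # ||U||
rhs = 2.0 * max_subset_norm(ys)    # 2 sup_sigma || sum_{i in sigma} y_i ||
print(lhs <= rhs)
```

Brute force over all sign patterns and all index sets is exponential in $n$, so this is meant only as a sanity check for small $n$.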
Consider the space $BV_0$ of numerical functions $\phi(p)$ of bounded variation
on $(a, b)$ with $\phi(a) =0$* and with $\|\phi\| =\int_a^b |d\phi|$.

THEOREM 12. If $f(p)$ on $(a, b)$ to the Banach space $Y$ is such that
$\gamma f(.)$ is in $BV_0$ for every $\gamma$ in $\bar{Y}$, then $f$ is of
bounded variation on $(a, b)$.


In Theorem 1 take $T =(a, b)$, $X = BV_0$ so that

$\int_a^b |d\gamma f| \le \|f\|\,\|\gamma\|$,

and thus $\|\Delta f\| \le \|f\|$.

We should like to point out another proof of this theorem. In order to put
it in the notation of the previous theorems we use $t$ in place of $\Delta$ and
put $F(t) =tf$. Then $\gamma F(t) =t\gamma f$ is in the space $M^*(T)$ for
every $\gamma$ in $\bar{Y}$; hence Theorem 12 is a corollary of Theorem 2
applied to $F(t)$.

This latter point of view will save some time for it makes the following
three theorems corollaries of Theorems 3, 7, and 10, respectively.


* This restriction is merely for simplicity of notation. 




314 NELSON DUNFORD [September 


THEOREM 13. If $f_p$ on $(a, b)$ to $\bar{Y}$ is such that $f_p y$ is in $BV_0$
for each $y$ in $Y$, then $f_p$ is of bounded variation on $(a, b)$.

THEOREM 14. Let $U$ be a space in which there is a notion of null set
satisfying condition (N), and let $Y$ be a separable Banach space. If
$f_{u,p}$ on $U(a, b)$ to $\bar{Y}$ is such that for each $y$ in $Y$ there is a
null set $U_y \subset U$ and an $M_y$ such that

$\int_a^b |d_p f_{u,p}\,y| \le M_y$, $u \in U - U_y$,

then there is a null set $U_0 \subset U$ such that $f_{u,p}$ is of bounded
variation in $p$ on $(a, b)$ uniformly with respect to $u$ in $U-U_0$.

The $v$ in Theorem 7 plays the role here of $t$ or $\Delta$, and Theorem 7
applied to the function $F_{u,v}=v f_{u,p}$ gives the present theorem.
Similarly Theorem 10 gives the theorem:


THEOREM 15. Let $U$ be a set $(u)$ in which there is a notion of null set
satisfying condition (N). Let $\bar{Y}$ be an f.s. space with determining
manifold $\Gamma$. If $f(u, p)$ on $U(a, b)$ to $Y$ is such that for each
$\gamma$ in $\Gamma$ there is a null set $U_\gamma \subset U$ and an
$M_\gamma$ such that

$\int_a^b |d_p \gamma f(u, p)| \le M_\gamma$, $u \in U - U_\gamma$,

then there is a null set $U_0 \subset U$ such that $f(u, p)$ is of bounded
variation in $p$ on $(a, b)$ uniformly with respect to $u$ in $U-U_0$.


THEOREM 16. If $f(p)$ on $(a, b)$ to $Y$ is such that the Riemann integral

$\int_a^b \phi(p)\,d\gamma f(p)$

exists for every continuous function $\phi$ and every $\gamma$ in $\bar{Y}$,
then the Riemann integral

$\int_a^b \phi(p)\,df(p)$

exists for every continuous function $\phi$.


In view of Theorems 11 and 12 it is sufficient to show that the existence
of the Riemann integral $\int_a^b \phi(p)\,d\psi(p)$ of an arbitrary continuous
function with respect to the real function $\psi$ implies*
$\int_a^b |d\psi| < \infty$.
* This fact is probably well known. It is stated without reference in the
introduction of a paper of Pollard [27]. Not knowing where it is proved and
because the above proof is another simple application of the principle of
uniform boundedness we give the details here.

If $\psi$ is not of bounded variation on $(a, b)$, a sequence $\pi_n$ of
partitions of $(a, b)$ with norm approaching zero can be formed in such a way
that each partition $\pi_n$ is composed of intervals $\delta_i$, $\delta_i'$
with

$|\sum_i \delta_i\psi| > n$.

From the principle of uniform boundedness, however, there is a constant $M$
independent of $n$ such that for an arbitrary continuous function $\phi(p)$ we
have the inequality

$|\sum_i \phi(\tau_i)\,\delta_i\psi + \sum_i \phi(\tau_i')\,\delta_i'\psi| \le M \sup_{a \le p \le b} |\phi(p)|$,

where $\tau_i$ ($\tau_i'$) is a point in $\delta_i$ ($\delta_i'$). For any
$n> M$ let $\phi$ be the continuous function which vanishes on $\delta_i'$ and
on $\delta_i$ has for its graph an isosceles triangle with base $\delta_i$ and
height one. Then if $\tau_i$ is the center of $\delta_i$, we have

$n < |\sum_i \delta_i\psi| = |\sum_i \phi(\tau_i)\,\delta_i\psi| \le M$,

a contradiction.
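
The first step of this argument, that a function not of bounded variation admits families of intervals carrying arbitrarily large increments of one sign, can be illustrated numerically. The sketch below uses a standard textbook example, $\psi(x) = x\cos(\pi/x)$ with $\psi(0)=0$, which is not taken from the paper; the function names are hypothetical.

```python
import math


def psi(x):
    """A continuous function on [0, 1] that is not of bounded variation."""
    return 0.0 if x == 0.0 else x * math.cos(math.pi / x)


def increments(n):
    """Increments of psi over the intervals (1/(k+1), 1/k), k = n-1, ..., 1."""
    points = [1.0 / k for k in range(n, 0, -1)]  # 1/n < ... < 1/2 < 1
    return [psi(b) - psi(a) for a, b in zip(points, points[1:])]


def variation_sum(n):
    """Sum of |increment| over all the intervals; grows like 2 log n."""
    return sum(abs(d) for d in increments(n))


def one_sign_sum(n):
    """Sum of the positive increments only; grows like log n."""
    return sum(d for d in increments(n) if d > 0.0)


for n in (10, 100, 1000):
    print(n, round(variation_sum(n), 3), round(one_sign_sum(n), 3))
```

Since $\psi(1/k) = (-1)^k/k$, the increment over $(1/(k+1), 1/k)$ equals $(-1)^k(1/k + 1/(k+1))$, so both sums diverge with $n$. The intervals with positive increment play the role of the $\delta_i$ in the proof, and the remaining ones the role of the $\delta_i'$.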
In a similar manner it is possible to demonstrate the following theorem: 


THEOREM 17. If $f_p$ on $(a, b)$ to $\bar{Y}$ is such that the Riemann integral

$\int_a^b \phi(p)\,d f_p y$

exists for every continuous function $\phi$ and every $y$ in $Y$, then the
Riemann integral

$\int_a^b \phi(p)\,d f_p$

also exists for every continuous function $\phi$.


For simplicity of statement we have avoided a generalization which might
be made in Theorems 2 to 15 inclusive (except 8 and 11). In all of these
theorems the set of $\gamma$ $[y]$ for which $\gamma f(t)$ $[f_t y]$ is
bounded, or essentially bounded, need not be assumed to be the whole of
$\Gamma$ $[Y]$ but merely a set of second category in $\Gamma$ $[Y]$. We have
so far only used Theorem 1 in the case where $X = M(T)$ or $M^*(T)$ and each of
these spaces is a special case of a Banach space $X$ where the following
condition holds. If $\|\phi_p\| \le M$ for $p=1, 2, \cdots$ and
$\phi_p(t) \to \phi(t)$ for every $t$ in $T$, then $\phi$ is in $X$, and
$\|\phi\| \le M$. For Banach spaces $X$ satisfying this condition it is readily
shown that if $\gamma f(.)$ is in $X$ for every $\gamma$ in a set of the second
category in $\Gamma$, then $\gamma f(.)$ is in $X$ for every $\gamma$ in
$\Gamma$. For the set $\Gamma_0 \subset \Gamma$, where $\gamma f(.)$ is in $X$,
is a linear set and contains a sphere in $\Gamma$ since it is a second category
sum of closed sets $\Gamma_n=\Gamma[\|\gamma f(.)\|_X \le n]$. Thus in the case
of a space $X$ satisfying the above condition, Theorem 1 can be worded as
follows:


THEOREM 1'. Let $X$ be a Banach space satisfying, besides the conditions (1),
(2), (3), the condition stated in the preceding paragraph. If the function
$f(t)$ on an arbitrary set $T$ to a Banach space $Y$ is such that $\gamma f(.)$
is in $X$ for every $\gamma$ in a set of second category in the closed linear
manifold $\Gamma \subset \bar{Y}$, then $f$ is in $\mathfrak{X}[Y, \Gamma]$
and $\gamma f(.)$ is a continuous operation from $\Gamma$ to $X$.


From this statement of Theorem 1 the generalization mentioned above is
obvious.

1.2. Adjoint operations. It has been proved by von Neumann [21], Stone 
([34], p. 61, Theorem 2.26), Tamarkin ([34], p. iv), and Stone and Tamarkin 
[35] that an additive operation on Hilbert space to Hilbert space is continu- 
ous providing its adjoint is everywhere defined. The corresponding theorem 
for Banach spaces, together with a number of similar theorems, can be ob- 
tained from Theorem 1 by specializing the range T, the function f, and the 
space X. As the reader will see, these theorems are all of a general type assert- 
ing that a limiting process when existing in a weak sense will also exist in a 
strong sense, and he may therefore expect to find them in Chapter II where 
such phenomena are discussed. Since the operations involved are additive 
functions and for such functions continuity is equivalent to boundedness, 
and since the proofs are entirely characteristic of the preceding proofs, we 
prefer to group these theorems with those on uniform boundedness. 

In this section Y and Z are arbitrary Banach spaces. 

By a determining manifold in $\bar{Y}$ will be meant a closed linear manifold
$\Gamma$ in $\bar{Y}$ such that

$\sup_{\|\gamma\|=1} \gamma y = \|y\|$, $y \in Y$.

We shall use the letter $\Gamma$ for a determining manifold in $\bar{Y}$. It
will not be assumed that $\Gamma$ is separable unless it is so stated. The
symbol $\Gamma^*$ will be used for a set of elements in $\bar{Y}$ (not
necessarily in $\Gamma$) such that for every $\gamma$ in $\Gamma$ there is a
sequence $\gamma_n^*$ of finite linear combinations of elements of $\Gamma^*$
such that $\gamma_n^* y \to \gamma y$ for every $y$ in $Y$. The symbols
$\gamma$, $\gamma^*$, $\bar{u}$, with or without subscripts or superscripts,
will always stand for points in $\Gamma$, $\Gamma^*$, $\bar{Z}$, respectively.

We shall be considering functions $y=f(z)$ on a set $D(f)$ (domain of $f$)
dense in $Z$ and with values in $Y$. The adjoint $\bar{f}$ of $f$ is a function
on a subset of $\bar{Y}$


1938] LINEAR SPACES 317 


with values in $\bar{Z}$. The domain of definition $D(\bar{f})$ of $\bar{f}$
consists of those $\bar{y}$ in $\bar{Y}$ for which there exists a $\bar{u}$
such that

$\bar{y}f(z) = \bar{u}z$, $z \in D(f)$.

Since $D(f)$ is dense in $Z$, $\bar{u}$ is unique. The function $\bar{f}$ is
then defined by the equation

$\bar{f}(\bar{y}) = \bar{u}$, $\bar{y} \in D(\bar{f})$.

THEOREM 18. If the domain $D(f)$ of $y=f(z)$ is dense in $Z$, and if the
adjoint $\bar{f}$ of $f$ is defined for every $\gamma$ in $\Gamma$, then
$D(\bar{f})=\bar{Y}$, $\bar{f}$ is continuous, and the domain $D(f)$ may be
extended to the whole of $Z$ in such a way that the extended function $f(z)$ is
a bounded linear operator with norm the same as that of $\bar{f}$.

First note that Theorem 2 holds if $\bar{Y}$ is replaced by a determining
manifold $\Gamma$ in $\bar{Y}$. Now in Theorem 2 take $T$ as the set of all $z$
in $D(f)$ with $\|z\| \le 1$. By Theorem 2

$\|f(z)\| \le M$, $z \in D(f)$, $\|z\| \le 1$.

The rest of the proof is obvious, and we leave it to the reader.
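
One way the omitted details might go is sketched below. This is a reconstruction rather than the author's argument: it assumes that $D(f)$ is a linear set, that a bound $\|f(z)\| \le M\|z\|$ on $D(f)$ has been obtained from Theorem 2, and it writes $\bar{f}$, $\bar{y}$, $\bar{u}$ for the adjoint and for points of the conjugate spaces.

```latex
% Sketch under the assumptions in the lead-in.
% (i) f is additive on D(f): for gamma in Gamma, gamma f(z) = \bar{u} z is
%     linear in z, and a determining manifold separates the points of Y.
% (ii) Being bounded on the dense linear set D(f), f extends uniquely by
%      continuity to all of Z with the same bound M.
% (iii) For every \bar{y} in \bar{Y} the functional z -> \bar{y} f(z) is bounded:
\[
|\bar{y} f(z)| \le \|\bar{y}\|\, M\, \|z\|,
\qquad\text{so } \bar{u} = \bar{y} f \in \bar{Z}
\text{ and } D(\bar{f}) = \bar{Y}.
\]
% (iv) Equality of the norms:
\[
\|\bar{f}\|
  = \sup_{\|\bar{y}\|=1}\ \sup_{\|z\|=1} |\bar{y} f(z)|
  = \sup_{\|z\|=1}\ \sup_{\|\bar{y}\|=1} |\bar{y} f(z)|
  = \sup_{\|z\|=1} \|f(z)\|
  = \|f\| .
\]
```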


THEOREM 19. If the adjoint of $y=f(z)$ on $Z$ to $Y$ is defined for every
$\gamma^*$ in $\Gamma^*$, then $f$ and $\bar{f}$ are bounded linear operators
with the same bound.


It is a well known corollary of Theorem 3 that if a sequence $\{\bar{u}_n\}$ of
linear functionals converges for every $z$, then the limit is a linear
functional. Thus the adjoint $\bar{f}$ is defined for every $\gamma$ in
$\Gamma$; and this theorem is a corollary of the preceding one.


THEOREM 20. If the additive function y=f(z) on Z to Y has the property 
that y*f(z) is continuous for every y* in I*, then f is continuous. 


This is merely a restatement of Theorem 19, and it is in this form that we 
prefer to state the further theorems of this type. 

We note in passing that for Banach spaces Y which are equivalent to their 
own conjugates it is possible to define the notion of a symmetric transforma- 
tion and obtain a corollary to Theorem 19 which states that if an additive 
symmetric transformation is defined everywhere, then it is continuous. For ex- 
ample a function f(y) on D(f)¢ Y to Y (where Y=Y) might be defined as 
symmetric if it obeys the law 


y'f(y) = y, D(f). 


THEOREM 21. If the additive function $f_z$ on $Z$ to $\bar{Y}$ has the property
that $f_z y$ is continuous in $z$ for each $y$ in a fundamental set in $Y$,
then $f_z$ is continuous.

This follows from Theorem 20 by replacing $Y$ by $\bar{Y}$, $\Gamma$ by $Y$,
and $\Gamma^*$ by a fundamental set in $Y$.

THEOREM 22. If $f$ and $f_n$, $(n=1, 2, \cdots)$, are additive functions on $Z$
to $Y$ such that $\gamma^*f_n(z)$ is continuous for each $\gamma^*$ in
$\Gamma^*$ and every integer $n$, and if

$\gamma^*f_n(z) \to \gamma^*f(z)$, $\gamma^* \in \Gamma^*$, $z \in Z$,

then $f$ is continuous.

This follows from Theorems 3 and 20. We leave the details of this theorem,
as well as the next, to the reader.

THEOREM 23. If $f_z$ and $f_z^n$, $(n=1, 2, \cdots)$, are additive functions on
$Z$ to $\bar{Y}$ such that $f_z^n(y)$ is continuous in $z$ for every $y$ in a
fundamental set in $Y$ and every integer $n$, and if

$\lim_n f_z^n(y) = f_z(y)$,

for $z$ in $Z$ and $y$ in a fundamental set in $Y$, then $f_z$ is continuous.


THEOREM 24. Let $S$ be an arbitrary set $(s)$ of elements, and let $f(z, s)$ on
$ZS$ to $Y$ be such that
(i) for each $\gamma$ in $\Gamma$ and $z$ in $Z$, $\gamma f(z, s)$ is bounded on $S$;
(ii) for each $\gamma$ in $\Gamma$ and $s$ in $S$, $\gamma f(z, s)$ is a continuous linear functional in $z$.
Then $f(z, s)$ is continuous and linear in $z$ uniformly with respect to $s$;
that is, there is a constant $M$ such that

$\sup_s \|f(z, s)\| \le M\|z\|$, $z \in Z$.


Let $T =ZS$, and take $X$ as the space of all real functions $u(z, s)$ on $T$
which satisfy the following conditions:

(a) For each $z$, $u(z, s)$ is bounded on $S$.

(b) For each $s$, $u(z, s)$ is a continuous linear functional on $Z$.

Then by Theorem 3

$\sup_s \|u(., s)\| = \sup_s \sup_{\|z\|=1} |u(z, s)| < \infty$.

This constant is taken as the norm of a point in $X$. Theorem 24 is then a
corollary of Theorem 1.


THEOREM 25. Let $S$ be an arbitrary set of elements $(s)$ and $f_{z,s}$ on $ZS$
to $\bar{Y}$ be such that
(i) for $y$ in $Y$ and $z$ in $Z$, $f_{z,s}(y)$ is bounded on $S$;
(ii) for $y$ in $Y$ and $s$ in $S$, $f_{z,s}(y)$ is continuous and linear in $z$.
Then $f_{z,s}$ is linear and continuous in $z$ uniformly with respect to $s$,
that is, there is a constant $M$ such that

$\sup_s \|f_{z,s}\| \le M\|z\|$, $z \in Z$.

This follows from Theorem 24 by replacing $Y$ by $\bar{Y}$ and $\Gamma$ by $Y$.


THEOREM 26. Let $S$ be an arbitrary set $(s)$ in which there is defined a
notion of null set satisfying condition (N). Let $Z$ be separable and
$f(z, s)$ on $ZS$ to $Y$ such that
(i) for each $\gamma$ in $\Gamma$ and $z$ in $Z$ there is a constant $M(\gamma, z)$ and a null set $S(\gamma, z)$ such that $|\gamma f(z, s)| <M(\gamma, z)$ on $S-S(\gamma, z)$;
(ii) for each $\gamma$ in $\Gamma$ and $s$ in $S$, $\gamma f(z, s)$ is continuous and linear in $z$.
Then there is a constant $M$ such that

ess. sup.$_s\ \sup_{\|z\|=1} |\gamma f(z, s)| \le M\|\gamma\|$;

and if $\Gamma$ is separable, there is a null set $S_0$ such that

$\|f(z, s)\| \le M\|z\|$, $s \in S - S_0$, $z \in Z$.


Take $T=ZS$ and $X$ as the space of all real functions on $T$ of the form
$u(z, s)$, where

(a) for each $s$, $u(z, s)$ is continuous and linear in $z$;

(b) for each $z$, $u(z, s)$ is essentially bounded in $s$.

Then by Theorem 5

ess. sup.$_s\,\|u(., s)\|$ = ess. sup.$_s\ \sup_{\|z\|=1} |u(z, s)| < \infty$,

and this constant is taken as the norm in $X$. The first conclusion follows
from Theorem 1. If $\{\gamma_i\}$ is dense in $\Gamma$, and if $S_i$ is the set
in $S$ where

$\sup_{\|z\|=1} |\gamma_i f(z, s)| > M\|\gamma_i\|$,

then the second conclusion follows by taking $S_0=\sum_{i=1}^{\infty} S_i$.


THEOREM 27. Let $S$ and $Z$ be as in Theorem 26, and $f_{z,s}$ on $ZS$ to
$\bar{Y}$ be such that

(i) for each $y$ in $Y$ and $z$ in $Z$ there is a constant $M(y, z)$ and a null
set $S(y, z)$ such that $|f_{z,s}(y)| <M(y, z)$ on $S-S(y, z)$;

(ii) for each $y$ in $Y$ and $s$ in $S$, $f_{z,s}(y)$ is continuous and linear
in $z$.

Then there is a constant $M$ such that

ess. sup.$_s\ \sup_{\|z\|=1} |f_{z,s}(y)| \le M\|y\|$.

If $Y$ is separable, then there is a null set $S_0$ such that

$\|f_{z,s}\| \le M\|z\|$, $s \in S - S_0$, $z \in Z$.

This follows from Theorem 26 by replacing $Y$ by $\bar{Y}$ and $\Gamma$ by $Y$.
The theorems of this section may be conveniently applied to the theory of
multilinear forms. One has, for example, the following theorem:


THEOREM 28. Let $Y$, $Z$, $Z'$ be Banach spaces, and $f(z, z')$ on $ZZ'$ to $Y$
be additive in each argument and such that for each $\gamma^*$ in $\Gamma^*$,
$\gamma^*f(z, z')$ is continuous in each of the variables $z$, $z'$ separately.
Then there is a constant $M$ such that

$\|f(z, z')\| \le M\|z\|\,\|z'\|$.

From the continuity of $\gamma^*f(z, z')$ in $z$ and $z'$ separately follows
that of $\gamma f(z, z')$ for any $\gamma$ in $\Gamma$. The desired conclusion
follows from Theorem 24 by taking $S$ as the unit sphere in $Z'$. A corollary
is the following:


THEOREM 29. Let $f_{z,z'}$ on $ZZ'$ to $\bar{Y}$ be additive in each argument
and such that for each $y_0$ in a fundamental set in $Y$, $f_{z,z'}(y_0)$ is
continuous in $z$, $z'$ separately. Then there is a constant $M$ such that

$\|f_{z,z'}\| \le M\|z\|\,\|z'\|$.

We shall leave further applications of this sort to the reader.

1.3. Concluding remarks. Although we shall not have much occasion to
use the fact, it might be pointed out that for certain Banach spaces $X$, the
corresponding space $\mathfrak{X}[Y, \Gamma]$ (where $\Gamma$ is a determining
manifold in $\bar{Y}$) is itself a Banach space. Suppose the Banach space $X$
satisfies, besides the conditions (1), (2), (3), the further condition:

(A) If $\phi_n \to \phi$, then $\phi_n(t) \to \phi(t)$ for each $t$ in $T$.

It follows from this that $\phi(t)=0$ on $T$ providing $\phi=\theta$, and also
that for fixed $t$, $\nu_t\phi =\phi(t)$ is a linear functional on $X$. Thus

$|\phi(t)| \le \|\nu_t\|\,\|\phi\|$.

This shows that if $s$ is an arbitrary parameter and

$\lim_{m,n} \|\phi_n^s - \phi_m^s\| = 0$

uniformly in $s$, then for each $t$

$|\phi_n^s(t) - \phi_m^s(t)| \le \|\nu_t\|\,\|\phi_n^s - \phi_m^s\| \to 0$

uniformly in $s$. Now suppose $\Gamma$ is a determining manifold in $\bar{Y}$
and $\{f_n\}$ is a Cauchy sequence of points in $\mathfrak{X}[Y, \Gamma]$. Then

$\lim_{m,n} \sup_{\|\gamma\|=1} \|\gamma f_m(.) - \gamma f_n(.)\| = 0$,

and thus

$\lim_{m,n} \sup_{\|\gamma\|=1} |\gamma f_m(t) - \gamma f_n(t)| = 0$, for each $t$,

so that the sequence $f_n$ defines uniquely a function $f(t)$ on $T$ to $Y$
such that

$f_n(t) \to f(t)$, $t \in T$.

It is readily shown that $f$ is in $\mathfrak{X}[Y, \Gamma]$ and that
$\|f_n -f\| \to 0$ where the norm is taken in the space $\mathfrak{X}$. Thus
$\mathfrak{X}[Y, \Gamma]$ is complete. This, together with certain other
obvious facts, shows that $\mathfrak{X}[Y, \Gamma]$ is a Banach space.

Now let $U(\gamma)$ be an arbitrary continuous linear operation from $\Gamma$ to $X$,
and suppose that $Y = \bar{\bar{Y}}$. Then $\nu_t U(\gamma)$, being continuous in $\gamma$, is
representable as $\nu_t U(\gamma) = \bar{\gamma}(\gamma)$, where $\bar{\gamma}$ is a point of $\bar{\Gamma}$. By the Hahn-Banach theorem
on the extension of linear functionals, there is a point $\bar{\bar{y}}$ in $\bar{\bar{Y}}$ such that
$\bar{\bar{y}}(\gamma) = \bar{\gamma}(\gamma)$ for $\gamma$ in $\Gamma$; and since $Y = \bar{\bar{Y}}$, we have for each $t$ in $T$ a point $f(t)$ in
$Y$ such that $\nu_t U(\gamma) = \gamma f(t)$ and thus $U(\gamma) = \gamma f(\cdot)$. This shows that for spaces
$Y$ equal to their second conjugates the operation of Theorem 1 is the general
linear operator from $\Gamma$ to $X$. To summarize we state the following theorem:


THEOREM 30. Let $X$ be a Banach space which satisfies, besides the conditions
(1), (2), (3), the further condition (A). Let $\Gamma$ be a determining manifold in $\bar{Y}$.
Then the space $\mathfrak{X}[Y, \Gamma]$ with the norm $\|f\|$ of Theorem 1 is a Banach space. If
in addition $Y = \bar{\bar{Y}}$, then every continuous linear operator $U(\gamma)$ on $\Gamma$ to $X$ is
expressible in the form

$$U(\gamma) = \gamma f(\cdot),$$
where $f$ is a point of $\mathfrak{X}[Y, \Gamma]$.


The condition (A) will not be assumed at any other place in this paper. 


CHAPTER II 


2.0. Uniform limiting processes. Most of the applications of Theorem 1 
that have been given thus far have been of the general type asserting that 
boundedness in a weak sense implies boundedness in a strong sense. The ap- 
plications of the present chapter are to cases where certain limiting processes, 
when existing in the weak sense, also exist in the strong sense. This is closely 
related to, and in many cases synonymous with, the statement that the
operation of Theorem 1 is not only continuous but completely continuous. This is
the case when $X$ is $l$ or $L$, $\Gamma = \bar{Y}$, and $Y$ is a separable space equal to its second
conjugate. Cases other than those in this chapter where a weak limiting
process implies a strong one will be found in Chapter IV.

Before proceeding to the results of this chapter we desire to point out the
connection between Theorem 32 and a theorem of Orlicz-Banach. Orlicz [23]
has shown that for weakly complete spaces the unconditional (that is,
absolute in this case) convergence of $\sum_n \gamma y_n$ for every $\gamma$ in $\bar{Y}$ implies the
unconditional convergence of $\sum_n y_n$. Banach ([1], p. 240) states this theorem in
a more general form, namely: If all partial sums of $\sum_n y_n$ converge weakly to
an element in $Y$, then $\sum_n y_n$ converges unconditionally. In this form it is not
assumed that $Y$ is weakly complete, and the proof can be carried out in a
fashion similar to that of Orlicz.*

In view of the known conditions for conditional compactness in $l$ (see, for
example, [5], Theorem 2) we can give the following proof of the Orlicz-
Banach theorem. Let $S$ be the unit sphere in $\bar{Y}$ and $U(\gamma) = \{\gamma y_n\}$ on $\bar{Y}$ to $l$
which, by Theorem 1, is continuous. It is also completely continuous, for
from any sequence in $S$ there is a subsequence $\gamma_i$ converging on the closed
linear manifold determined by $y_n$, $(n = 1, 2, \ldots)$. Since Theorem 1 shows
that the sequence $\{U(\gamma_i)\}$ is bounded, to show it convergent weakly and
hence in $l$, it is sufficient to show that $fU(\gamma_i)$ converges for every $f$ in a
fundamental set in $\bar{l}$. Such a fundamental set is the set of characteristic functions
of sets $\sigma$ of integers. For such an $f = f_\sigma$,

$$f_\sigma U(\gamma_i) = \sum_{n \in \sigma} \gamma_i y_n = \gamma_i y_\sigma,$$

and this converges since $y_\sigma$ is in the closed linear manifold determined by $y_n$.
Thus $U(S)$ is conditionally compact; therefore

$$\lim_{N} \sum_{n=N}^{\infty} |\gamma y_n| = 0$$

uniformly on $S$, which shows that $\sum_{n \in \sigma} y_n = y_\sigma$ for every set $\sigma$ of integers.

Theorem 32 to follow is a generalization of the Orlicz-Banach theorem, 
but the proof even in the case considered by these authors is different from 
that of Orlicz. 

We shall now assume that $X$ is a Banach space of numerical functions
$\phi(p)$ on a range $P$ satisfying, besides the conditions (1), (2), and (3) (with $T$
replaced by $P$), the further condition:

(4) If $\phi_i$ is in $X$, $(i = 1, 2, \ldots)$, and $\phi_i(p) \to \phi(p)$ for $p$ in $P$, and if $\nu\phi_i$
converges for every $\nu$ in $\bar{X}$, then $\phi$ is in $X$ and $\phi_i \to \phi$ in $X$.

It is evident that $l$ has this property, for weak and strong convergence in $l$
are equivalent. Another space in which we shall be interested, which satisfies
conditions (1) to (4), is the space $L(E) = L(E, \alpha)$. In this symbolism $E$ is an
abstract set, and $\alpha$ a completely additive measure function defined on a
$\sigma$-field $a(E)$ of “measurable” subsets of $E$. The space $L(E)$ is then the space
of numerical functions on $E$ summable on $E$ with respect to the total
variation $|\alpha|$ of $\alpha$, and the norm of a point $\phi$ in $L(E)$ is

$$\|\phi\| = \int_E |\phi(p)|\, d|\alpha|.$$

* Pettis has also given a proof of this theorem; see [24].


The space $L(E)$ enjoys the property (4), for if $\nu\phi_i$ converges for every $\nu$
in $\bar{L}(E)$, we have in particular

$$\lim_i \int_e \phi_i(p)\, d\alpha$$

existing for every measurable subset $e$ of $E$. Thus by a theorem of Vitali-
Hahn-Saks [31] the integrals are equi-absolutely continuous. It follows
immediately that $\phi$ is in $L(E)$ and $\phi_i \to \phi$ in $L(E)$.

In what follows in this chapter $P = (p)$ and $T = (t)$ are arbitrary sets, $X$ is a
Banach space of numerical functions $\phi(p)$ on $P$ which satisfies conditions (1)
to (4), and $F$ is a fundamental set in $\bar{X}$. The space $Y$ is an arbitrary Banach
space, $\Gamma$ is an arbitrary closed linear manifold in $\bar{Y}$, $y(p, t)$ is a function on $PT$
to $Y$, and $Y_0$ is the closed linear manifold in $Y$ determined by $y(P, T)$. The
symbol $M(X)$ will be used for the space of numerical functions $\phi(p, t)$ in $X$
for each $t$ in $T$ with $\|\phi\| = \sup_t \|\phi(\cdot, t)\|_X < \infty$. The following assumptions
will sometimes be made:


I. For every $p$ in $P$ the set $y(p, T)$ is conditionally compact.
II. Either $Y_0$ is separable, or every bounded sequence in $\Gamma$ contains a
subsequence $\gamma_i$ such that $\gamma_i y_0$ converges for every $y_0$ in $Y_0$.


Note that the first of the two alternative assumptions in II implies the
second, and that I implies II in case the range $P$ is denumerable.
In the following theorem the domain $D(U)$ of $U$ is the set of all $\mu$ in $\bar{Y}$
for which $U(\mu) = \mu y(p, t)$ is in $M(X)$.

THEOREM 31. Assume I and II, and that
(i) $\Gamma \subset D(U)$, and
(ii) for $f$ in $F$ there is a $y_f(t)$ on $T$ to $Y_0$ with $y_f(T)$ conditionally compact and

$$f\gamma y(\cdot, t) = \gamma y_f(t), \qquad \gamma \in \Gamma,\ t \in T.$$

It follows that
(iii) if $\{\gamma_i\}$ is a bounded sequence in $\Gamma$, $\mu$ a point of $\bar{Y}$, and $\gamma_i y_0 \to \mu y_0$ for $y_0$
in $Y_0$, then $\mu$ is in $D(U)$ and $U(\gamma_i)$ approaches $U(\mu)$ in $M(X)$;
(iv) in case $\Gamma = \bar{Y}$ and $\gamma_i y_0 \to 0$ for $y_0$ in $Y_0$, then $U(\gamma_i) \to 0$ in $M(X)$; and
(v) $U(\gamma)$ is completely continuous on $\Gamma$ to $M(X)$.


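To prove (iii) we have by Theorem 1

$$\sup_{\|\gamma\| \le 1}\ \sup_t \|\gamma y(\cdot, t)\|_X < \infty,$$

so that

$$\sup_{i,\,t} \|\gamma_i y(\cdot, t)\|_X \le M.$$

For $f$ in $F$, $f\gamma_i y(\cdot, t) = \gamma_i y_f(t)$, and this sequence converges for each $t$. Thus
since $\gamma_i y(\cdot, t)$ is a bounded sequence in $X$, it is weakly convergent. Also
$\gamma_i y(p, t) \to \mu y(p, t)$ for every $p$ and $t$. Hence for each $t$, $\mu y(\cdot, t)$ is in $X$ and
$\gamma_i y(\cdot, t) \to \mu y(\cdot, t)$ in $X$. Thus

$$\sup_t \|\mu y(\cdot, t)\|_X \le M,$$

so that $U(\mu)$ is in $M(X)$. If $U(\gamma_i)$ does not approach $U(\mu)$ in $M(X)$, there are
sequences $i_q$ and $t_q$ and an $\epsilon > 0$ such that

$$\epsilon \le \|\phi_q\|_X \le 2M,$$

where $\lambda_q = \gamma_{i_q} - \mu$, and $\phi_q = \lambda_q y(\cdot, t_q)$ is a point in $X$.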

To obtain the contradiction we shall show that

(a) $\phi_q(p) \to 0$ for each $p$ in $P$, and

(b) $f\phi_q$ converges for every $f$ in $\bar{X}$.

Since $V_p = y(p, T)$ is conditionally compact it is totally bounded and is thus
covered by a finite number of spheres $K(y_i, \delta)$, $(i = 1, 2, \ldots, m_\delta)$, with centers
$y_i$ in $V_p$ and radii $\delta$. There is a $q_\delta$ such that

$$|\lambda_q y_i| \le \delta, \qquad i = 1, 2, \ldots, m_\delta;\ q \ge q_\delta,$$
and for each $y$ in $V_p$ there is an $i$ such that $\|y - y_i\| < \delta$. Thus for any $y$ in $V_p$,
$$|\lambda_q y| \le |\lambda_q y - \lambda_q y_i| + |\lambda_q y_i| < \delta\Bigl[\sup_q \|\lambda_q\| + 1\Bigr], \qquad q \ge q_\delta.$$

This shows that $\lambda_q y \to 0$ uniformly on $V_p$; hence (a) is true. Since $\|\phi_q\|_X \le 2M$,
it suffices, in proving (b), to show that $f\phi_q$ converges for every $f$ in $F$. Now
from (ii), and the fact that $\gamma_i y(\cdot, t) \to \mu y(\cdot, t)$ in $X$, we have

$$f\phi_q = \lambda_q y_f(t_q),$$

and since $y_f(T)$ is conditionally compact, it follows as above that $f\phi_q \to 0$. This
completes the proof of (iii).

To prove (iv) note that while $\|\gamma_i\|$ may not be bounded, the sequence
$\|\gamma_i\|_0 = \sup_{\|y_0\| = 1} |\gamma_i y_0|$ (where $y_0$ is in $Y_0$) is bounded. Thus by a theorem of
Hahn-Banach ([1], p. 27) on the extension of linear functionals, there is a
sequence $\{\lambda_i\}$ in $\bar{Y}$ such that $\{\lambda_i\}$ is bounded in $\bar{Y}$ and $\lambda_i$ coincides with $\gamma_i$
on $Y_0$. Conclusion (iv) is then a corollary of (iii). In view of II, for every
bounded sequence in $\Gamma$ there is a subsequence $\gamma_i$ and a functional $\mu$ such that
$\gamma_i y \to \mu y$ on $Y_0$. Thus (v) also follows from (iii).

Theorem 31 enables us to state eight theorems some of which will be of
use later in the theory of integration. These theorems deal with the following
Banach spaces (where $T = (t)$ is an arbitrary set of elements):

$E_1$ is the space composed of sequences of numerical functions $\phi_n(t)$ such
that $\sup_t \sum_{n=1}^{\infty} |\phi_n(t)| < \infty$. The norm is then defined as

$$\|\phi\| = \sup_t \sum_{n=1}^{\infty} |\phi_n(t)|.$$

$E_2$ is the subset of $E_1$ for which

$$\lim_{N \to \infty} \sup_t \sum_{n=N}^{\infty} |\phi_n(t)| = 0.$$

The norm in $E_2$ is the same as that in $E_1$.

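$E_3$: Suppose that in $T$ there is a notion of null set satisfying condition (N).
Then $E_3$ is the set of sequences of numerical functions $\phi_n(t)$ such that
$\operatorname{ess\,sup}_t \sum_{n=1}^{\infty} |\phi_n(t)| < \infty$. The norm is

$$\|\phi\| = \operatorname*{ess\,sup}_t \sum_{n=1}^{\infty} |\phi_n(t)|.$$

$E_4$ is the subset of $E_3$ for which

$$\lim_{N} \operatorname*{ess\,sup}_t \sum_{n=N}^{\infty} |\phi_n(t)| = 0.$$

The norm in $E_4$ is the same as in $E_3$.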

Let $E$ be an abstract set, and $\alpha$ a completely additive numerical measure
function defined on a $\sigma$-field $a(E)$ of “measurable” subsets of $E$. Let $|\alpha|(e)$
be the total variation of $\alpha$ on $e$, and suppose that $|\alpha|(E) < \infty$.

$E_5$ is the space of numerical functions $\phi(p, t)$ on $ET$ such that, for
each $t$ in $T$, $\phi(p, t)$ is summable relative to $|\alpha|$ on $E$ and such that
$\sup_t \int_E |\phi(p, t)|\, d|\alpha| < \infty$. The norm is then

$$\|\phi\| = \sup_t \int_E |\phi(p, t)|\, d|\alpha|.$$

$E_6$ is the subspace of $E_5$ for which

$$\lim_{|\alpha|(e) \to 0} \sup_t \int_e |\phi(p, t)|\, d|\alpha| = 0.$$

The norm in $E_6$ is the same as for $E_5$.

$E_7$: In case there is a notion of null set in $T$ satisfying condition (N), then
$E_7$ is the space of numerical functions $\phi(p, t)$ on $ET$ such that for almost all $t$ in $T$,
$\phi(p, t)$ is summable with respect to $\alpha$ on $E$ and

$$\|\phi\| = \operatorname*{ess\,sup}_t \int_E |\phi(p, t)|\, d|\alpha| < \infty.$$

$E_8$ is the subspace of $E_7$ consisting of all functions for which

$$\lim_{|\alpha|(e) \to 0} \operatorname*{ess\,sup}_t \int_e |\phi(p, t)|\, d|\alpha| = 0.$$

The norm in $E_8$ is the same as in $E_7$.

In case the set $T$ has but a single element, the first four of these spaces
reduce to $l$ while the last four reduce to $L = L(E, \alpha)$.
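
For reference, the norms of the eight spaces can be collected in one display. This merely restates the definitions given above and adds nothing new; the operator name $\operatorname{ess\,sup}$ is a typesetting choice made here, standing for the essential supremum taken with respect to the null sets of $T$.

```latex
\begin{align*}
E_1,\ E_2 &: \quad \|\phi\| = \sup_{t \in T} \sum_{n=1}^{\infty} |\phi_n(t)|, \\
E_3,\ E_4 &: \quad \|\phi\| = \operatorname*{ess\,sup}_{t \in T}
             \sum_{n=1}^{\infty} |\phi_n(t)|, \\
E_5,\ E_6 &: \quad \|\phi\| = \sup_{t \in T} \int_E |\phi(p,t)|\, d|\alpha|, \\
E_7,\ E_8 &: \quad \|\phi\| = \operatorname*{ess\,sup}_{t \in T}
             \int_E |\phi(p,t)|\, d|\alpha| .
\end{align*}
```

When $T$ consists of a single point the supremum over $t$ disappears, so the first two lines become the norm of $l$ and the last two the norm of $L(E, \alpha)$, in agreement with the remark above.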

THEOREM 32. Let $\{y_n(t)\}$ be a sequence of functions on $T$ to $Y$ such that
(i) for every set $\sigma$ of integers there is a $y_\sigma(t)$ on $T$ to $Y_0$ such that

$$\sum_{n \in \sigma} \gamma y_n(t) = \gamma y_\sigma(t), \qquad t \in T,\ \gamma \in \Gamma;$$

(ii) for every $\gamma$ in $\Gamma$, $U(\gamma) = \{\gamma y_n(t)\}$ is in $E_1$; and
(iii) for every $n$ and $\sigma$ the sets $y_n(T)$, $y_\sigma(T)$ are conditionally compact.

Then it follows that
(iv) the transformation $U(\gamma)$ on $\Gamma$ to $E_1$ is completely continuous;

(v) for each $t$ in $T$

$$\lim_{N} \sum_{n \ge N} |\gamma y_n(t)| = 0$$

uniformly with respect to $\|\gamma\| \le 1$;

(vi) the set $\Gamma$ of linear functionals for which it is assumed that $U(\gamma)$ is
defined can be extended to include any $\gamma$ in $\bar{Y}$ for which there exists a sequence $\gamma_i$
in $\Gamma$ with $\|\gamma_i\|$ bounded and $\gamma_i y_n(t) \to \gamma y_n(t)$ for every $n$ and $t$. If $\gamma_i$ and $\gamma$ are
such functionals, then

$$\lim_{i} \sup_t \sum_{n=1}^{\infty} |\gamma_i y_n(t) - \gamma y_n(t)| = 0;$$

(vii) in case $\Gamma = \bar{Y}$ and $\gamma_i y \to 0$ for $y$ in $Y_0$, then

$$\lim_{i} \sup_t \sum_{n=1}^{\infty} |\gamma_i y_n(t)| = 0;$$

(viii) in case $\Gamma$ is a determining manifold in $\bar{Y}$, then for every set $\sigma$ of integers
the series $\sum_{n \in \sigma} y_n(t)$ converges on $T$ to $y_\sigma(t)$.

Most of these conclusions follow immediately from the preceding theorem.
Here the set $P$ is replaced by the set of integers so that condition II is
automatically satisfied. Conclusion (v) follows from (iv) by applying the
conditions for compactness in $l$, and conclusion (viii) follows immediately from (v).

It should be pointed out that if $\Gamma = \bar{Y}$, and $Y$ is weakly complete, then
in hypothesis (i) the existence of $y_\sigma(t)$ is implied by the convergence of
$\sum_{n \in \sigma} \gamma y_n(t)$. Also if $T$ has but a finite number of points, then (iii) is
automatically satisfied. Thus if $\bar{Y} = \Gamma$, if $Y$ is weakly complete, and if $X = l$, the
transformation of Theorem 1 is completely continuous. Similar remarks hold for the
following theorems:


THEOREM 33. Under the hypothesis of the preceding theorem, except that now
it is assumed that $U(\gamma)$ is in $E_2$ for every $\gamma$ in $\Gamma$, the conclusions (v) and (viii)
can be strengthened to the following statements:

(v′) uniformly with respect to $\|\gamma\| \le 1$,

$$\lim_{N} \sup_t \sum_{n \ge N} |\gamma y_n(t)| = 0;$$

(viii′) in case $\Gamma$ is a determining manifold in $\bar{Y}$ then, for every set $\sigma$ of
integers, the series $\sum_{n \in \sigma} y_n(t)$ converges uniformly on $T$ to $y_\sigma(t)$.


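If (v′) were not true, there would exist sequences $N_i \to \infty$, $t_i$, and $\gamma_i$, an
$\epsilon > 0$, and a functional $\gamma$ (perhaps not in $\Gamma$) such that

$$\|\gamma_i\| \le 1; \qquad \gamma_i y_0 \to \gamma y_0,\ y_0 \in Y_0; \qquad 0 < \epsilon < \sum_{n \ge N_i} |\gamma_i y_n(t_i)|,\ i = 1, 2, \ldots.$$

By (vi) of the preceding theorem $U(\gamma_i) \to U(\gamma)$ in $E_1$, and, since $E_2$ is a closed
linear manifold in $E_1$, $U(\gamma)$ must be in $E_2$. Thus

$$\limsup_{i} \sum_{n \ge N_i} |\gamma_i y_n(t_i)| \le \limsup_{i}\ \sup_t \sum_{n \ge N_i} |\gamma_i y_n(t) - \gamma y_n(t)| + \lim_{i} \sup_t \sum_{n \ge N_i} |\gamma y_n(t)| = 0,$$

which is the desired contradiction. Conclusion (viii′) is a corollary of (v′).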


THEOREM 34. Assume I and II and that $U(\gamma) = \gamma y(p, t)$ on $\Gamma$ to $E_5$ is such
that
(i) for every $e$ in $a(E)$ there is a $y_e(t)$ on $T$ to $Y_0$ with $y_e(T)$ conditionally
compact and

$$\gamma y_e(t) = \int_e \gamma y(p, t)\, d\alpha, \qquad \gamma \in \Gamma,\ t \in T.$$

Then
(ii) if $\{\gamma_i\}$ is a bounded sequence in $\Gamma$, $\mu$ a point of $\bar{Y}$, and $\gamma_i y_0 \to \mu y_0$
for $y_0$ in $Y_0$, then $U(\mu)$ is in $E_5$ and

$$\lim_{i} \sup_t \int_E |\gamma_i y(p, t) - \mu y(p, t)|\, d|\alpha| = 0;$$

(iii) for each $t$
$$\lim_{|\alpha|(e) \to 0} \int_e |\gamma y(p, t)|\, d|\alpha| = 0$$
uniformly with respect to $\|\gamma\| \le 1$;
(iv) $U$ is completely continuous;
(v) in case $E = (0, 1)$ and $\alpha$ is Lebesgue measure, then for each $t$

$$\lim_{h \to 0} \int_0^{1-h} |\gamma y(p + h, t) - \gamma y(p, t)|\, dp = 0$$
uniformly for $\|\gamma\| \le 1$.


This follows from Theorem 31. Here F is the set of characteristic functions 
of measurable sets e. In view of II, conclusion (iii) (which however is not as 
strong as (iv)) follows from (ii), and (v) is a known condition for compactness 
in L [37]. 

THEOREM 35. If, in addition to the assumptions of the preceding theorem,
we assume that for each $\gamma$ in $\Gamma$

$$\lim_{|\alpha|(e) \to 0} \sup_t \int_e |\gamma y(p, t)|\, d|\alpha| = 0,$$

then this limit exists uniformly with respect to $\|\gamma\| \le 1$.


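If the conclusion were not true, there would exist sequences $t_i$, $\gamma_i$, $e_i$
(with $|\alpha|(e_i) \to 0$), an $\epsilon > 0$ and a $\gamma$ (perhaps not in $\Gamma$) such that

$$\|\gamma_i\| = 1,$$
$$\gamma_i y_0 \to \gamma y_0, \qquad y_0 \in Y_0,$$

and

$$\int_{e_i} |\gamma_i y(p, t_i)|\, d|\alpha| > \epsilon.$$

By the preceding theorem $U(\gamma)$ is in $E_5$ and $U(\gamma_i) \to U(\gamma)$ in $E_5$. But since
$E_6$ is a closed linear manifold in $E_5$ and $U(\gamma_i)$ is in $E_6$, it follows that $U(\gamma)$ is in
$E_6$ and thus

$$\limsup_{i} \int_{e_i} |\gamma_i y(p, t_i)|\, d|\alpha| \le \limsup_{i}\ \sup_t \int_E |\gamma y(p, t) - \gamma_i y(p, t)|\, d|\alpha| + \lim_{i} \sup_t \int_{e_i} |\gamma y(p, t)|\, d|\alpha| = 0,$$

which is a contradiction.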
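The following two theorems deal with transformations $U(\gamma) = \gamma y_n(t)$ on $\Gamma$
to $E_3$ and $E_4$, respectively. By $T_\gamma$ will be understood the set in $T$ for which

$$\sum_{n=1}^{\infty} |\gamma y_n(t)| \le \|U\|\cdot\|\gamma\|.$$

Then by Theorem 1, $T - T_\gamma$ is a null set.

THEOREM 36. Let $y_n(t)$ be a sequence of functions on $T$ to $Y$, $\Gamma$ a
separable closed linear manifold in $\bar{Y}$ such that
(i) for every set $\sigma$ of integers there is a $y_\sigma(t)$ on $T$ to $Y_0$ such that for every $\gamma$
in $\Gamma$,
$$\sum_{n \in \sigma} \gamma y_n(t) = \gamma y_\sigma(t), \qquad t \in T_\gamma;$$
(ii) for every $\gamma$ in $\Gamma$, $U(\gamma) = \gamma y_n(t)$ is in $E_3$; and
(iii) for every $n$ and $\sigma$ the sets $y_n(T)$, $y_\sigma(T)$ are conditionally compact.
Then it follows that
(iv) in case $\{\gamma_i\}$ is a bounded sequence in $\Gamma$ with $\gamma_i y_0 \to \mu y_0$ on $Y_0$, $U(\mu)$
is in $E_3$ and $U(\gamma_i) \to U(\mu)$ in $E_3$;
(v) $U$ is completely continuous; and
(vi) in case $\Gamma$ is a determining manifold in $\bar{Y}$, there is a null set $T_0$ such that
for every $\sigma$ we have

$$\sum_{n \in \sigma} y_n(t) = y_\sigma(t), \qquad t \in T - T_0.$$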

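THEOREM 37. If in addition to the conditions of the preceding theorem it is
assumed that $U(\gamma)$ is in $E_4$, then

$$\lim_{N} \operatorname*{ess\,sup}_t \sum_{n \ge N} |\gamma y_n(t)| = 0$$

uniformly for $\|\gamma\| \le 1$. For every $\sigma$

$$\sum_{n \in \sigma} y_n(t) = y_\sigma(t)$$

uniformly on $T - T_0$.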
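To prove Theorems 36 and 37 let $T_0 = \sum_{i=1}^{\infty} (T - T_{\gamma_i})$, where $\{\gamma_i\}$ is dense
in $\Gamma$ so that $T_0$ is a null set. Now let $t$ be fixed in $T - T_0$, and assume $\epsilon > 0$
and $N$ an integer. There is then an $i$ such that $\|\gamma - \gamma_i\| < \epsilon$ and

$$\sum_{n=1}^{N} |\gamma y_n(t) - \gamma_i y_n(t)| < \epsilon;$$

thus

$$\sum_{n=1}^{N} |\gamma y_n(t)| \le \epsilon + \sum_{n=1}^{N} |\gamma_i y_n(t)| \le \epsilon + \|U\|\cdot\|\gamma_i\|,$$

which shows that

$$\sup_{t \in T - T_0} \sum_{n=1}^{\infty} |\gamma y_n(t)| \le \|U\|\cdot\|\gamma\|.$$

Theorems 36 and 37 now follow from Theorems 32 and 33 applied to the
set $T - T_0$.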

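The following two theorems deal with a transformation $U(\gamma) = \gamma y(p, t)$
on $\Gamma$ to $E_7$ and $E_8$, respectively. Here $T_\gamma$ is the set in $T$ for which

$$\int_E |\gamma y(p, t)|\, d|\alpha| \le \|U\|\cdot\|\gamma\|.$$

By Theorem 1, $T - T_\gamma$ is a null set.


THEOREM 38. Assume I and II, $\Gamma$ separable, and $U(\gamma) = \gamma y(p, t)$ on $\Gamma$ to
$E_7$ is such that
(i) for every $e$ in $a(E)$ there is a $y_e(t)$ on $T$ to $Y_0$ with $y_e(T)$ conditionally
compact such that for $\gamma$ in $\Gamma$

$$\gamma y_e(t) = \int_e \gamma y(p, t)\, d\alpha \quad \text{on } T_\gamma.$$

Then it follows that

(ii) if $\{\gamma_i\}$ is a bounded sequence in $\Gamma$ and $\gamma_i y \to \mu y$ for every $y$ in $Y_0$, then
$U(\mu)$ is in $E_7$ and $U(\gamma_i) \to U(\mu)$ in $E_7$;

(iii) for almost all $t$

$$\lim_{|\alpha|(e) \to 0} \int_e |\gamma y(p, t)|\, d|\alpha| = 0$$

uniformly for $\|\gamma\| \le 1$;

(iv) $U$ is completely continuous; and

(v) in case $E = (0, 1)$ and $\alpha$ is Lebesgue measure, then for almost all $t$

$$\lim_{h \to 0} \int_0^{1-h} |\gamma y(p + h, t) - \gamma y(p, t)|\, dp = 0$$

uniformly for $\|\gamma\| \le 1$.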
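
Let $\gamma_i$ be dense in $\Gamma$ and $T_0 = \sum_i (T - T_{\gamma_i})$. Fix $\gamma$ in $\Gamma$ and $t_0$ in $T - T_0$, and
let $\gamma_{i_j} \to \gamma$. Then

$$\gamma_{i_j} y(p, t_0) \to \gamma y(p, t_0)$$

for every $p$, and

$$\gamma_{i_j} y_e(t) = \int_e \gamma_{i_j} y(p, t)\, d\alpha$$

providing $t$ is in $T_{\gamma_{i_j}}$; thus, since $t_0$ is in $T_{\gamma_{i_j}}$,

$$\gamma_{i_j} y_e(t_0) = \int_e \gamma_{i_j} y(p, t_0)\, d\alpha.$$

Since $\gamma_{i_j} \to \gamma$ the integrals $\int_e \gamma_{i_j} y(p, t_0)\, d\alpha$ are equi-absolutely continuous (see
[31]) and

$$\int_E |\gamma_{i_j} y(p, t_0) - \gamma y(p, t_0)|\, d|\alpha| \to 0.$$

Thus

$$\int_E |\gamma y(p, t_0)|\, d|\alpha| \le \|U\|\cdot\|\gamma\|.$$

This shows that

$$\sup_{t \in T - T_0} \int_E |\gamma y(p, t)|\, d|\alpha| \le \|U\|\cdot\|\gamma\|,$$

and the theorem thus follows from Theorem 34 applied to the set $T - T_0$.
In like manner the following theorem follows from Theorem 35: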


THEOREM 39. If in addition to the assumptions of the preceding theorem we
assume that for each $\gamma$ in $\Gamma$

$$\lim_{|\alpha|(e) \to 0} \operatorname*{ess\,sup}_t \int_e |\gamma y(p, t)|\, d|\alpha| = 0,$$
then this limit exists uniformly for $\|\gamma\| \le 1$.


THEOREM 40. Let $S$ be any bounded set in $\bar{Y}$ and $y(e)$ an additive function
on $a(E)$ to $Y$ such that $y(a(E))$ is separable. Then if for each $\gamma$ in $S$

$$\lim_{|\alpha|(e) \to 0} \gamma y(e) = 0,$$

this same limit holds uniformly for $\gamma$ in $S$.

For if not, there exist sequences $\gamma_i$ and $e_i$ with $|\alpha|(e_i) \to 0$, and a positive $\epsilon$
with
$$|\gamma_i y(e_i)| > \epsilon, \qquad i = 1, 2, \ldots.$$

Since $y(a(E))$ is separable, there is a subsequence $\gamma_i'$ of $\gamma_i$ such that $\gamma_i' y(e)$
converges for each $e$; thus by a theorem of Saks [31] the functions $\gamma_i' y(e)$ are
equi-absolutely continuous, which contradicts the above inequality.


THEOREM 41. Suppose that the space $a(E)$ when metrized by the distance
function

$$(e_1, e_2) = |\alpha|(e_1 + e_2 - e_1 e_2)$$

is a separable space. Let $\Gamma$ be a determining manifold in $\bar{Y}$, and let $\Gamma^*$ be a set
of points in $\bar{Y}$ (not necessarily in $\Gamma$) such that for every $\gamma$ in $\Gamma$ there is a sequence
$\gamma_n^*$ of finite linear combinations of elements in $\Gamma^*$ with

$$\lim_{n} \gamma_n^* y = \gamma y, \qquad y \in Y.$$

Let $y(e)$ be an additive function on $a(E)$ to $Y$ such that

(i) for every sequence $\{e_n\}$ of disjoint sets in $a(E)$, $y(\sum_n e_n)$ is in the closed
linear manifold determined by $y(e_n)$;

(ii) $\lim_{|\alpha|(e) \to 0} \gamma^* y(e) = 0$, $(\gamma^* \in \Gamma^*)$.

Then

$$\lim_{|\alpha|(e) \to 0} y(e) = 0.$$

Since for arbitrary $\gamma$ in $\Gamma$

$$\gamma y(e) = \lim_{n} \gamma_n^* y(e),$$

and each $\gamma_n^* y(e)$ is continuous on the complete metric space $a(E)$, the
function $\gamma y(e)$ is in Baire’s first class and thus continuous at a point. Since it is
additive, it must be continuous everywhere.

Let $e_n$ be a sequence of disjoint sets in $a(E)$. Then (i) implies that for every
set $\sigma$ of integers $y(\sum_{n \in \sigma} e_n)$ is in the closed linear manifold determined by
$y(e_n)$, $(n \in \sigma)$. Further

$$\sum_{n \in \sigma} \gamma y(e_n) = \gamma y\Bigl(\sum_{n \in \sigma} e_n\Bigr), \qquad \gamma \in \Gamma.$$


Thus it follows from Theorem 32 that $y(e)$ is completely additive. From this
fact it follows immediately that $\lim_n y(e_n) = y(e)$ if $e_n \to e$ monotonically. Now
let $\{e_n\}$ be a sequence dense in $a(E)$, and let $Y_0$ be the closed linear manifold
in $Y$ determined by $y(e)$, where $e$ is a sum of a finite number of the sets $e_n$.
Then $Y_0$ is separable. Now let $e$ be an arbitrary set in $a(E)$, and from a
subsequence of $\{e_n\}$ converging in the metric of $a(E)$ to $e$, pick a subsequence
$e_n'$ such that

$$\Bigl(e,\ \prod_{m=1}^{\infty} \sum_{n=m}^{\infty} e_n'\Bigr) = 0,$$

that is, such that

$$e = \prod_{m=1}^{\infty} \sum_{n=m}^{\infty} e_n'$$

except for a null set. Now $y(\sum_{n=m}^{m+p} e_n')$ is in $Y_0$ and

$$\lim_{p} y\Bigl(\sum_{n=m}^{m+p} e_n'\Bigr) = y\Bigl(\sum_{n=m}^{\infty} e_n'\Bigr),$$

so that for each $m$, $y(\sum_{n=m}^{\infty} e_n')$ is in $Y_0$. But

$$\lim_{m} \sum_{n=m}^{\infty} e_n' = \prod_{m=1}^{\infty} \sum_{n=m}^{\infty} e_n'$$

monotonically, and thus

$$y(e) = \lim_{m} y\Bigl(\sum_{n=m}^{\infty} e_n'\Bigr) \text{ is in } Y_0.$$

Hence by the preceding theorem

$$\lim_{|\alpha|(e) \to 0} \|y(e)\| = \lim_{|\alpha|(e) \to 0}\ \sup_{\|\gamma\| = 1} |\gamma y(e)| = 0.$$

THEOREM 42. Suppose that the space $a(E)$, when metrized by the distance
function

$$(e_1, e_2) = |\alpha|(e_1 + e_2 - e_1 e_2),$$

is a separable space. Let $y(e)$ be an additive function on $a(E)$ to $Y$, and let $\Gamma^*$ be
a set in $\bar{Y}$ such that for every $\gamma$ in $\bar{Y}$ there is a sequence $\gamma_n^*$ of finite linear
combinations of elements in $\Gamma^*$ with

$$\lim_{n} \gamma_n^* y = \gamma y, \qquad y \in Y.$$
If
$$\lim_{|\alpha|(e) \to 0} \gamma^* y(e) = 0, \qquad \gamma^* \in \Gamma^*,$$

then

$$\lim_{|\alpha|(e) \to 0} y(e) = 0.$$

This follows from the preceding theorem applied to $\Gamma = \bar{Y}$ noting that in
this case the equation

$$\sum_{n} \gamma y(e_n) = \gamma y\Bigl(\sum_{n} e_n\Bigr), \qquad \gamma \in \bar{Y},$$

where $\{e_n\}$ is a sequence of disjoint sets, assures us that hypothesis (i) of the
preceding theorem is fulfilled.


CHAPTER III 


3.0. Extension of linear functionals and integration. In this section the
domain of a point $\nu$ in $\bar{X}$ will be extended from $X$ to $\mathfrak{X}$; that is, we shall
assign a meaning to $\nu f$, where $\nu$ is a linear functional on $X$ and $f$ is a point in
$\mathfrak{X} = \mathfrak{X}[Y, \Gamma]$ and in particular investigate the properties of the integral $\int f\, d\alpha$.


THEOREM 43. If $f$ is in $\mathfrak{X}[Y, \Gamma]$, $\nu$ is in $\bar{X}$, and $\gamma$ is in $\Gamma$, then $\nu\gamma f(\cdot)$ is
linear in $\gamma$ and

$$|\nu\gamma f(\cdot)| \le \|\nu\|\cdot\|\gamma\|\cdot\|f\|.$$

This is a corollary of Theorem 1. The equation

$$\bar{\gamma}\gamma = \nu\gamma f(\cdot)$$

then defines uniquely a point $\bar{\gamma}$ in $\bar{\Gamma}$, and we shall define $\nu f$ as this point in $\bar{\Gamma}$.
The notation is perhaps faulty in that it does not show the dependence of $\nu f$
upon the closed linear manifold $\Gamma$, and perhaps $\nu_\Gamma f$ would be preferable. This
latter notation will be used in any discussion where the manifold $\Gamma$ is not
fixed. It might be noted here that if $f$ is in $\mathfrak{X}[Y, \Gamma']$, and $\Gamma' \supset \Gamma$, then $\nu_{\Gamma'} f$ is
an extension of $\nu_\Gamma f$.

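Besides the space $\mathfrak{X}[Y, \Gamma]$ it will sometimes be useful to consider the
subclass $\mathfrak{X}_0[Y, \Gamma]$ consisting of all $f$ in $\mathfrak{X}[Y, \Gamma]$ having the property that for every
$\nu$ in $\bar{X}$ there is a $y$ (depending on $\nu$ and $f$) such that

$$\nu\gamma f = \gamma y, \qquad \gamma \in \Gamma.$$

THEOREM 44. The space $\mathfrak{X}_0[Y, \Gamma]$ is linear, and in case $\Gamma = \bar{Y}$ it is closed in
$\mathfrak{X}[Y, \Gamma]$.
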
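To see that $\mathfrak{X}_0$ is closed in $\mathfrak{X}$ if $\Gamma = \bar{Y}$, suppose that $f_n$ is in $\mathfrak{X}_0$, $f$ is in $\mathfrak{X}$, and
$\|f_n - f\| \to 0$, where the norm is taken in the space $\mathfrak{X}$. This means that

$$\sup_{\|\gamma\| = 1} \|\gamma f_n - \gamma f\| \to 0;$$
hence

$$\lim_{n}\ \sup_{\|\nu\| \le 1}\ \sup_{\|\gamma\| = 1} |\nu\gamma f_n - \nu\gamma f| = 0,$$

or

$$\sup_{\|\nu\| \le 1}\ \sup_{\|\gamma\| = 1} |\gamma y_n(\nu) - \nu\gamma f| = \sup_{\|\nu\| \le 1} \|y_n(\nu) - \nu f\| \to 0,$$

where $y_n(\nu)$ is regarded as a point in $\bar{\Gamma} = \bar{\bar{Y}}$. Thus since $Y$ is closed in $\bar{\bar{Y}}$ there
is a $y(\nu)$ such that $y_n(\nu) \to y(\nu)$ uniformly for $\|\nu\| \le 1$ and thus

$$\nu\gamma f = \lim_{n} \gamma y_n(\nu) = \gamma y(\nu)$$

for every $\gamma$ in $\Gamma$. The proof that $\mathfrak{X}_0$ is linear will be left to the reader.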


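THEOREM 45. If $\Gamma = \bar{Y}$, then a necessary and sufficient condition for a point $f$
in $\mathfrak{X}[Y, \Gamma]$ to be in $\mathfrak{X}_0[Y, \Gamma]$ is that for every $\nu_0$ in a fundamental set $F \subset \bar{X}$ there
is a $y_0$ such that

$$\nu_0\gamma f(\cdot) = \gamma(y_0), \qquad \gamma \in \Gamma.$$

For if $\nu\gamma f(\cdot) = \gamma(y)$ for every $\nu$ in a fundamental set, the same identity
holds for every $\nu$ in a dense set. Thus if $\nu$ is an arbitrary point in $\bar{X}$ and
$\|\nu_n - \nu\| \to 0$, where

$$\nu_n\gamma f(\cdot) = \gamma y_n,$$

we have, using Theorem 43,

$$\|y_m - y_n\| = \sup_{\|\gamma\| = 1} |(\nu_m - \nu_n)\gamma f| \le \|\nu_m - \nu_n\|\cdot\|f\| \to 0,$$

so that if $y = \lim_n y_n$, $\nu\gamma f(\cdot) = \gamma y$.

This proves the sufficiency of the condition, and the necessity is obvious.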

3.1. Integration of numerical functions. Before proceeding to a discus- 
sion of integration of an abstract valued function of an abstract variable it 
is necessary to set down here certain properties of real summable functions 
of an abstract variable as well as properties of the Lebesgue integral of such 
functions. 

It is well known from the works of Radon [28], Fréchet [8], Nikodym 
[22], Ridder [29], and others, that a theory of Lebesgue integration can be 
developed for real functions of an abstract variable. In fact there are numer- 
ous equivalent ways of defining the Lebesgue integral of such a function and 
several of these are discussed in the paper by Ridder. A basis for such a theory 
is usually a completely additive family $a(E)$ of “measurable” subsets of a
given measurable set $E$ and a completely additive real function $\alpha$. What we
have to say here will be based upon the postulates and results in the treatise
of Saks ([30], pp. 247-263). Since a completely additive set function is
expressible as the difference of two completely additive non-negative set
functions, one might restrict the discussion (as Saks does) to the case where $\alpha$
is monotone. We prefer not to assume this. Since only functions which are
summable with respect to $|\alpha|$, the total variation of $\alpha$, are considered, there
can be no meaningless symbols such as $\infty - \infty$ arising in the discussion.
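
The decomposition referred to above can be written out explicitly. What follows is the standard Jordan decomposition, recalled only as a reminder; the symbols $\alpha^{+}$ and $\alpha^{-}$ (the upper and lower variations of $\alpha$) do not appear in the original text, and $|\alpha|(E) < \infty$ is assumed as elsewhere in the paper.

```latex
\begin{align*}
\alpha &= \alpha^{+} - \alpha^{-}, \qquad
|\alpha| = \alpha^{+} + \alpha^{-}, \\
\int_E \phi \, d\alpha
  &= \int_E \phi \, d\alpha^{+} - \int_E \phi \, d\alpha^{-}, \\
\Bigl| \int_E \phi \, d\alpha \Bigr|
  &\le \int_E |\phi| \, d|\alpha| < \infty .
\end{align*}
```

Since $\alpha^{+} \le |\alpha|$ and $\alpha^{-} \le |\alpha|$, a function summable with respect to $|\alpha|$ is summable with respect to both parts, so the difference in the second line is always a difference of finite numbers.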


The symbol $L^q$, $L^q(E)$, or $L^q(E, \alpha)$ will be used for the space of numerical
measurable functions $\phi(p)$ on $E$ for which $|\phi(p)|^q$ is summable with respect
to $|\alpha|$ on $E$; and $L$, $L(E)$, or $L(E, \alpha)$ will be used in place of $L^1$, $L^1(E)$,
$L^1(E, \alpha)$, respectively. Upon introducing the norm

$$\|\phi\| = \Bigl(\int_E |\phi(p)|^q\, d|\alpha|\Bigr)^{1/q}$$

the space $L^q$ becomes a Banach space.* The triangle property of this norm
follows from Minkowski’s inequality for denumerable sums together with the
fact that a function in $L^q$ can be approached in norm by denumerably valued
functions. The completeness of the space can be established in a well known
fashion. It suffices to show that a Cauchy sequence $\phi_n$ in $L^q$ determines a
function $\phi$ such that $\phi_n(p) \to \phi(p)$ approximately with respect to $\alpha$ on $E$, and
secondly that the integrals $\int_e |\phi_n(p)|^q\, d|\alpha|$ are equi-absolutely continuous.
From these two results one readily concludes that $\phi$ is in $L^q$ and $\|\phi_n - \phi\| \to 0$.

The general linear functional $\psi$ on $L^q$, $(q < \infty)$, is expressible in terms of a
point $\psi$ in $L^{q'}$ (where $1/q + 1/q' = 1$) by the formula

$$\psi\phi = \int_E \psi(p)\phi(p)\, d|\alpha|,$$
and the norm $\|\psi\|$ is given by

$$\|\psi\| = \begin{cases} \Bigl(\int_E |\psi(p)|^{q'}\, d|\alpha|\Bigr)^{1/q'}, & \text{if } q' < \infty, \\ \operatorname{ess\,sup} |\psi(p)|, & \text{if } q' = \infty. \end{cases}$$

To see this we note that by a theorem of Nikodym [22]

$$\psi\phi_e = \int_e \psi(p)\, d|\alpha|,$$

where $\phi_e$ is the characteristic function of the measurable set $e$. Therefore
$\psi\phi = \int_E \psi(p)\phi(p)\, d|\alpha|$ for all finitely valued functions $\phi$. If $\phi_n$ is a sequence
of finitely valued functions with $\|\phi_n - \phi\| \to 0$, then $\psi(p)\phi_n(p) \to \psi(p)\phi(p)$
approximately with respect to $\alpha$ on $E$, and $\int_e \psi(p)\phi_n(p)\, d|\alpha|$ converges for every
measurable set $e$. This shows that $\psi\phi$ is summable on $E$ and that
$\psi\phi = \int_E \psi(p)\phi(p)\, d|\alpha|$. Furthermore, as the reader can readily show, every
function in $L^q$ is approachable in norm by finitely valued functions, and thus

$$\psi\phi = \int_E \psi(p)\phi(p)\, d|\alpha|$$

for every $\phi$ in $L^q$.

* We assume that $1 \le q < \infty$. For $q = \infty$ the space $L^q$ is the space $M(E)$ of essentially bounded
measurable functions.
To see that y is in L*’ we proceed as follows. Since ¥¢ is in L for every 
in we have |¥(p)|*/¢ sgn in L¢ and thus 


1/q 


which shows that | is in Thus 


E 


al)” 


| a| (E) 
In general 


f | | < | | 
E 


Taking now first the case where $q > 1$, we see that the integrand on the left
side of the above inequality approaches $|\psi(p)|^{q'}$; hence by Fatou’s lemma

$$\int_E |\psi(p)|^{q'}\, d|\alpha| \le \|\psi\|^{q'}.$$

By Hölder’s inequality $\|\psi\|^{q'} \le \int_E |\psi(p)|^{q'}\, d|\alpha|$, so that the theorem is
established for $q > 1$. In case $q = 1$ let $e_m$ be the set on which $|\psi(p)| \ge m$; then


That is,
$$(m/\|\psi\|)^n\, |\alpha|(e_m) \le |\alpha|(E),$$

for every positive number $m$ and every integer $n$. This shows that $|\alpha|(e_m) = 0$
if $m > \|\psi\|$, or in other words,

$$\operatorname{ess\,sup} |\psi(p)| \le \|\psi\|.$$
On the other hand, it is obvious that
$$\operatorname{ess\,sup} |\psi(p)| \ge \|\psi\|,$$

which completes the proof of the theorem.

From well known inequalities it follows that for every $\psi$ in $L^{q'}$ and $\phi$
in $L^q$ the product $\psi\phi$ is in $L$; and if $\psi$ is fixed, the function $\int_E \psi(p)\phi(p)\, d|\alpha|$
is a linear functional on $L^q$. Thus the conjugate space $\bar{L}^q$ is isometrically
isomorphic with $L^{q'}$. To summarize the above facts about $L^q$ we state the
following theorem:

THEOREM 46. Assume $1 \le q < \infty$. Then the space $L^q = L^q(E, \alpha)$ of
numerical measurable functions $\phi$ for which $|\phi|^q$ is summable with respect to
$|\alpha|$ on $E$ is a Banach space under the norm

$$\|\phi\| = \Bigl(\int_E |\phi(p)|^q\, d|\alpha|\Bigr)^{1/q}.$$

Every linear functional $\psi$ on $L^q$ is expressible in the form

$$\psi\phi = \int_E \psi(p)\phi(p)\, d|\alpha|,^*$$

where $\psi$ is a point of $L^{q'}$, $(q' = q/(q-1))$, and

$$\|\psi\| = \begin{cases} \Bigl(\int_E |\psi(p)|^{q'}\, d|\alpha|\Bigr)^{1/q'}, & \text{if } q' < \infty, \\ \operatorname{ess\,sup} |\psi(p)|, & \text{if } q' = \infty. \end{cases}$$

Conversely if $\psi$ is a point of $L^{q'}$, then $\psi\phi$ is summable for every $\phi$ in $L^q$, and
$\int_E \psi(p)\phi(p)\, d|\alpha|$ is a linear functional on $L^q$.

In case $q = \infty$ the general linear functional on $L^q$ is given in terms of an
integral with respect to an additive set function of bounded variation [7],
[15], but we shall not use this in what follows.
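
Two special cases may help fix the meaning of Theorem 46. The display below is an illustration only; the subscripted norms $\|\cdot\|_{q}$ and $\|\cdot\|_{q'}$ are a notation introduced here to keep the two spaces apart, and the last line is the familiar Hölder inequality on which the converse part of the theorem rests.

```latex
\begin{align*}
q = 2:\quad & q' = 2, \qquad
  \|\psi\| = \Bigl( \int_E |\psi(p)|^{2}\, d|\alpha| \Bigr)^{1/2}, \\
q = 1:\quad & q' = \infty, \qquad
  \|\psi\| = \operatorname*{ess\,sup}_{p \in E} |\psi(p)|, \\
\text{in general:}\quad &
  \Bigl| \int_E \psi(p)\phi(p)\, d|\alpha| \Bigr|
  \le \|\psi\|_{q'} \, \|\phi\|_{q} .
\end{align*}
```

For $q = 2$ the space is its own conjugate and the inequality is the Schwarz inequality; for $q = 1$ the conjugate space consists of the essentially bounded measurable functions.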


3.2. Integration of abstract functions. In case $X = L^q(E, \alpha)$ and the
functional $\nu$ in $\bar{X}$ is taken as $\nu\phi = \int_E \phi\, d\alpha$, then the meaning assigned (in §3.0) to $\nu f$
for $f$ in $\mathfrak{L}^q(E)[Y, \Gamma]$ is to be taken as the definition of $\int_E f\, d\alpha$.

Our chief interest in this chapter will be the linear space $\mathfrak{L}^q(E)[Y, \Gamma]$, its
specializations $\mathfrak{L}^q(E)[Y, \bar{Y}]$, $\mathfrak{L}^q(E)_0[Y, \bar{Y}]$, and $\mathfrak{L}^q(E)[\bar{Y}, Y] = \mathfrak{L}^q(E)_0[\bar{Y}, Y]$,
and the integral as a linear operator on these spaces. Here and elsewhere
unless explicitly stated to the contrary $q$ is an arbitrary real number with
$1 \le q < \infty$. The space $\mathfrak{L}(E)_0[Y, \bar{Y}] = \mathfrak{L}^1(E)_0[Y, \bar{Y}]$ includes the various classes
of functions called summable or integrable by the authors Graves [10],
Hildebrandt [14], Bochner [4], Dunford [6], Birkhoff [3], and, in view of
Theorem 45, is identical with the class of summable functions discussed by
Pettis [24]. The space $\mathfrak{L}(E)[\bar{Y}, Y]$ is the space of summable functionals
defined recently by Gelfand [9] for the case $E = (0, 1)$.

* It is also expressible in the form $\psi\phi = \int_E \psi(p)\phi(p)\, d\alpha$.


We shall begin with a discussion of the general space $\mathfrak{L}^q(E)[Y, \Gamma]$ and
the integral on this class. The discussion can then be applied to all of the
above cases. A number of the properties of the space $\mathfrak{L}^q(E)[Y, \Gamma]$, and of the
integral $\int_E f(p)\, d\alpha$ on this space, are immediate consequences of the definitions
and of Theorem 1. We shall state a few of them first.

THEOREM 47. The space $\mathfrak{L}^q(E)[Y, \Gamma]$ is a normed linear space, the norm
being that of Theorem 1. The integral $\int_E f(p)\, d\alpha$ is a linear (that is, additive and
continuous) operation on $\mathfrak{L}^q(E)[Y, \Gamma]$ to $\bar{\Gamma}$.

THEOREM 48. (i) If $f$ is in $\mathfrak{L}^q(E)[Y, \Gamma]$, then $f$ is in $\mathfrak{L}^q(e)[Y, \Gamma]$ for every
$e$ in $a(E)$, the family of measurable subsets of $E$.

(ii) The function $f$ on $E$ to $Y$ is in $\mathfrak{L}^q(E)[Y, \Gamma]$ if and only if $\phi \cdot f$ is in
$\mathfrak{L}(E)[Y, \Gamma]$ for every $\phi$ in $L^{q'}(E)$.

THEOREM 49. If $\gamma$ is in $\Gamma$, $\phi$ in $L^{q'}(E)$, and $f$ in $\mathfrak{L}^q(E)[Y, \Gamma]$, then

$$\Bigl|\int_E \gamma\phi(p) f(p)\, d\alpha\Bigr| \le \|\gamma\|\cdot\|\phi\|\cdot\|f\|.$$

A corollary of this is the following theorem:


THEOREM 50. If $1 < q < \infty$ the integral

$$\int_e \phi(p) f(p)\, d\alpha$$

is a completely additive and absolutely continuous function on $a(E)$ for every $f$ in
$\mathfrak{L}^q(E)[Y, \Gamma]$ and $\phi$ in $L^{q'}(E)$.

Here and elsewhere in the paper a set function $y(e)$ is called absolutely
continuous in case $y(e) \to 0$ with $|\alpha|(e)$, and completely additive in case
$y(\sum_n e_n) = \sum_n y(e_n)$ for every sequence $\{e_n\}$ of disjoint measurable sets. In the
preceding equality the series on the right must be unconditionally convergent
since the left side is independent of the order of the sequence $\{e_n\}$.
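
A sketch of how the absolute continuity asserted in Theorem 50 can be read off from Theorem 49. Several assumptions are made here and should be checked against the full text: Theorem 49 is read as the bound $|\int_E \gamma\phi f\, d\alpha| \le \|\gamma\|\cdot\|\phi\|\cdot\|f\|$; the norm of a point of $\bar{\Gamma}$ is the supremum of its values on the unit sphere of $\Gamma$; and the integral over $e$ is the integral over $E$ of $\chi_e \phi f$, where $\chi_e$ (a symbol not used in the original) is the characteristic function of $e$.

```latex
\begin{align*}
\Bigl\| \int_e \phi(p) f(p)\, d\alpha \Bigr\|
  &= \sup_{\gamma \in \Gamma,\ \|\gamma\| = 1}
     \Bigl| \int_E \gamma\, \chi_e(p)\phi(p) f(p)\, d\alpha \Bigr| \\
  &\le \|\chi_e \phi\| \cdot \|f\|
   = \Bigl( \int_e |\phi(p)|^{q'}\, d|\alpha| \Bigr)^{1/q'} \|f\| .
\end{align*}
```

Because $q > 1$ gives $q' < \infty$ and $|\phi|^{q'}$ is summable with respect to $|\alpha|$, the last integral tends to zero with $|\alpha|(e)$, which is the absolute continuity in the sense just defined.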


THEOREM 51. If $Y$ is separable and $q > 1$, then

$$\mathfrak{L}^q(E)[Y, \bar{Y}] = \mathfrak{L}^q(E)_0[Y, \bar{Y}].$$

For if $\gamma_n y \to \gamma y$ for every $y$ in $Y$, then $\|\gamma_n\|$ is a bounded sequence, and, by
the preceding theorem, we have for every $e$ in $a(E)$

$$\lim_{|\alpha|(e') \to 0} \int_{ee'} \gamma_n f(p)\, d\alpha = 0$$

uniformly in $n$. Thus

$$\int_e \gamma_n f(p)\, d\alpha \to \int_e \gamma f(p)\, d\alpha.$$

By a theorem of Banach ([1], p. 131, Theorem 8) there is a $y$ such that

$$\gamma y = \int_e \gamma f(p)\, d\alpha, \qquad \gamma \in \bar{Y}.$$

The desired conclusion follows from Theorem 45.

Theorem 51 contains a recent theorem of Krein [17]. Krein has proved
that if $f(t)$ on $(0, 1)$ to a Banach space $Y$ has the property that $\gamma f(t)$ is
continuous for every $\gamma$ in $\bar{Y}$, then there is a point $y$ in $Y$ such that

$$\gamma y = \int_0^1 \gamma f(t)\, dt, \qquad \gamma \in \bar{Y},$$

and has used this result in a discussion of a fixed point theorem. In order
to see more clearly the connection between Krein’s result and Theorem 51, we
should state this theorem in a slightly different form. If we do not want all
of $\mathfrak{L}^q(E)[Y, \bar{Y}]$ to be in $\mathfrak{L}^q(E)_0[Y, \bar{Y}]$ but merely want a particular function $f$
in $\mathfrak{L}^q(E)[Y, \bar{Y}]$ to be in $\mathfrak{L}^q(E)_0[Y, \bar{Y}]$, it is sufficient to assume that $f(E - \text{a
null set})$ is separable (and it is thus not necessary to assume the whole of $Y$ to
be separable), which is equivalent [24] to assuming that $f$ is measurable.
Theorem 51 stated in this new form reads as follows:*


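THEOREM 52. Assume $q>1$, and let $f$ on $E$ to $Y$ be measurable and such that
$\bar y f$ is in $L^q(E)$ for every $\bar y$ in $\bar Y$. Then $f$ is in $\mathfrak{L}^q(E)_0[Y,\bar Y]$, which means that
for every $\phi$ in $L^{q'}(E)$ there is a $y_\phi$ such that

$$\bar y y_\phi=\int_E\bar y\phi(p)f(p)\,d\alpha,\qquad\bar y\in\bar Y.$$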


To obtain Krein's result from this it suffices to prove that $f[(0,1)]$ is
separable. Let $\pi_n$ be a partitioning of (0, 1) into intervals $\delta_m^n$, $(m=1,2,\dots,p_n)$,
such that the norm of the partitioning approaches zero with $1/n$. If $\tau_m^n$ is a
point of $\delta_m^n$ and $f_n(t)=f(\tau_m^n)$ on $\delta_m^n$, then $\bar y f_n(t)\to\bar y f(t)$ for every $\bar y$ and $t$,
and thus $f(t)$ must be in the separable closed linear manifold determined by
the points $f(\tau_m^n)$.
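The step functions used in this argument can be tried out numerically. The following is a minimal sketch under assumptions that are not in the paper: Y is taken to be the plane (so weak and strong convergence coincide), the partition consists of n equal intervals, the points tau are midpoints, and `step_integral` is a hypothetical helper name. It integrates the step function and compares the result with the known integral of a sample continuous function.

```python
import math


def step_integral(f, n):
    """Integral over (0, 1) of the step function equal to f(tau_m) on the
    m-th of n equal subintervals, with tau_m chosen as the midpoint."""
    dim = len(f(0.5))
    total = [0.0] * dim
    for m in range(n):
        tau = (m + 0.5) / n
        value = f(tau)
        for j in range(dim):
            total[j] += value[j] / n
    return total


def f(t):
    """Sample continuous function on (0, 1) with values in the plane."""
    return [t * t, math.cos(t)]


exact = [1.0 / 3.0, math.sin(1.0)]
errors = []
for n in (10, 100, 1000):
    approx = step_integral(f, n)
    errors.append(max(abs(a - b) for a, b in zip(approx, exact)))
```

The entries of `errors` decrease as the partition is refined, which is the finite-dimensional shadow of the convergence used above.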

In the case considered by Krein more might be said, for the boundedness
of $\bar y f(t)$ for every $\bar y$ in $\bar Y$ insures us that the norms $\|f(t)\|$ are bounded, and
thus the function $f$ is an absolutely integrable and measurable function.


* Pettis [24] has given a slightly different form of this theorem. 


THEOREM 53. If $f$ is in $\mathfrak{L}(E)[Y,\Gamma]$, and the set of values taken by the
integral 


f eea(E), 


is a separable set in T, then the integral is a completely additive and absolutely 
continuous function on a(E). 


Since the unit sphere in $\Gamma$ is a bounded set in $\Gamma$, this theorem is a corollary
of Theorem 40.


THEOREM 54. Suppose $a(E)$ separable,* $f$ in $\mathfrak{L}(E)_0[Y,\Gamma]$, where $\Gamma$ is a
determining manifold in $\bar Y$, and

(i) for any sequence $\{e_n\}$ of disjoint sets, $\int_{\sum e_n}f(p)\,d\alpha$ is in the closed linear
manifold determined by $\int_{e_n}f(p)\,d\alpha$.

Then the integral $\int_e f(p)\,d\alpha$ is a completely additive and absolutely continuous
function on $a(E)$. Further if $\Gamma=\bar Y$, the hypothesis (i) may be omitted.


This is a corollary of Theorems 41 and 42. It is not as general as these 
theorems for there are absolutely continuous set functions which are not in- 
definite integrals [24]. 


THEOREM 55. Let E be the real interval (a, b) and a Lebesgue measure, and 
let f be in 20(E)[Y, and in L(E). Then if 


7) = f = 
we have 
fe 


and 
b 


f &(t)f(0)dt. 


b 
f = 07 
Let $a(E')$ be another $\sigma$-field of measurable subsets of a measurable set $E'$,
and let $\alpha'$ be a completely additive set function on $a(E')$.


THEOREM 56. If the function f(p, p’) on EE’ to Y belongs to the spacet 
Q(EE’, aXa’)[Y, 0], and if for all p in E, except for a set upon which |a| =0, 
f(b, p’) is in a’) then fy-f(p, p’)da’ is in a) [T, and 


* For the case $\Gamma=\bar Y$ Pettis proves this theorem without the assumption of separability. His
method however can be applied to the theorem as stated here without the assumption of separability.

† For the notion of the product measure on the product space and the general Fubini theorem
for real functions see Saks [30], or Lomnicki and Ulam [20].


$$\int_{EE'}f(p,p')\,d(\alpha\times\alpha')=\int_E d\alpha\int_{E'}f(p,p')\,d\alpha'.$$


THEOREM 57. Let $f_n$ be in $\mathfrak{L}(E)[Y,\Gamma]$, $(n=1,2,\dots)$, and suppose
$\int_e f_n(p)\,d\alpha$ is absolutely continuous for each $n$. Then if

$$\lim_n\int_e f_n(p)\,d\alpha$$

exists for every $e$ in $a(E)$, the integrals $\int_e f_n(p)\,d\alpha$ are equi-absolutely continuous.


The familiar argument of Saks [31] based on the Baire category theorem
holds in this environment. For upon setting $F_n(e)=\int_e f_n(p)\,d\alpha$, the space $a(E)$
with metric $(e_1,e_2)=|\alpha|(e_1+e_2-e_1e_2)$ is the sum of the closed sets

$$a_q=a(E)\big[\|F_m(e)-F_n(e)\|\le\epsilon/3,\ m,n\ge q\big],$$

and thus by the Baire theorem, there is an integer $q_0$ and a sphere $S(e_0,r)\subset a_{q_0}$.
Let $\delta<r$ be such that

$$\|F_{q_0}(e)\|<\epsilon/3,\qquad|\alpha|(e)<\delta.$$

Thus for $|\alpha|(e)<\delta$ the sets $e_1=e+(e_0-e)$, $e_2=e_0-e$ are in $S(e_0,r)$, and

$$\|F_m(e)-F_n(e)\|\le\|F_m(e_1)-F_n(e_1)\|+\|F_m(e_2)-F_n(e_2)\|\le2\epsilon/3$$

for $m,n\ge q_0$; and for $|\alpha|(e)<\delta$ we have

$$\|F_m(e)\|<\epsilon,\qquad m\ge q_0.$$
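The metric used in this argument is the measure of the symmetric difference, since in the sum and product notation for sets $e_1+e_2-e_1e_2$ is the part of the union lying outside the intersection. The sketch below is an illustration only, under assumptions not made in the paper: the space is a four-point set with positive point masses, and `measure`, `rho` and `all_subsets` are hypothetical helper names. It checks the metric axioms by brute force over all subsets.

```python
from itertools import combinations

# Assumed toy measure: positive point masses on a four-point set.
weights = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}


def measure(e):
    """Total mass of the set e."""
    return sum(weights[p] for p in e)


def rho(e1, e2):
    """Distance between two sets: the measure of their symmetric difference."""
    return measure((e1 | e2) - (e1 & e2))


def all_subsets(points):
    """Every subset of the given points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]


subsets = all_subsets(weights)
symmetric_ok = all(rho(a, b) == rho(b, a) for a in subsets for b in subsets)
zero_on_diagonal = all(rho(a, a) == 0 for a in subsets)
triangle_ok = all(rho(a, c) <= rho(a, b) + rho(b, c) + 1e-12
                  for a in subsets for b in subsets for c in subsets)
```

In a general measure space two sets at distance zero differ by a null set, so the metric lives on classes of sets; the toy example has no null sets other than the empty set.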


THEOREM 58. Let $f_n$ be in $\mathfrak{L}(E)[Y,\Gamma]$, $(n=1,2,\dots)$, let $f_n(p)\to f(p)$
approximately on $E$, and let $\int_e f_n(p)\,d\alpha$ be absolutely continuous for each $n$. Then the
following assertions are equivalent:

(i) The limit, $\lim_n\int_e f_n(p)\,d\alpha$, exists on $a(E)$.
(ii) The function $f$ is in $\mathfrak{L}(E)[Y,\Gamma]$ and

$$\lim_n\int_e f_n(p)\,d\alpha=\int_e f(p)\,d\alpha$$

uniformly on $a(E)$.

(iii) $\lim_{|\alpha|(e)=0}\limsup_m\left\|\int_e f_m(p)\,d\alpha\right\|=0$.
(iv) $\lim_{|\alpha|(e)=0}\int_e f_m(p)\,d\alpha=0$ uniformly in $m$.


The proof of the theorem is not a great deal different from that for the case of
absolutely integrable and measurable functions $f(p)$ (see Dunford [6], pp.
447-448). There are several differences however; for example, in the present
case we do not know that the set $e(n,\epsilon)$ consisting of all the points of $e$ where
$\|f_n(p)-f(p)\|\ge\epsilon$ is a measurable set. It might also be pointed out that in the
case of absolutely integrable functions it is necessary to assume that the limit
function is summable, while in the broader space $\mathfrak{L}(E)[Y,\Gamma]$ this is part of
the conclusion. Before proving the theorem we shall recall the meaning of
approximate convergence. The sequence $f_n(p)$ is said to approach $f(p)$
approximately on $E$ in case for every $n$ and $\epsilon>0$ there is a measurable set
$e'(n,\epsilon)\supset e(n,\epsilon)$ such that

$$\lim_n|\alpha|(e'(n,\epsilon))=0.$$

If $\|\bar y\|=1$, and $e(\bar y,n,\epsilon)$ is that part of $e$ on which

$$|\bar y f_n(p)-\bar y f(p)|\ge\epsilon,$$

then $e(\bar y,n,\epsilon)$ is measurable and

$$e(\bar y,n,\epsilon)\subset e(n,\epsilon)\subset e'(n,\epsilon),$$

which shows that $\bar y f_n(p)\to\bar y f(p)$ in measure. The sequence $f_n(p)$ is said to
approach $f(p)$ almost uniformly on $E$ if for every $\epsilon>0$ there are sets $E_\epsilon$ and $E_\epsilon'$
with $E_\epsilon'$ measurable, $E_\epsilon'\supset E-E_\epsilon$, $|\alpha|(E_\epsilon')<\epsilon$, and $f_n(p)\to f(p)$ uniformly
on $E_\epsilon$. It is known [6] that if $f_n(p)\to f(p)$ approximately on $E$, then every
subsequence of $f_n(p)$ contains a subsequence which approaches $f(p)$ almost
uniformly on $E$. Now to demonstrate the theorem we shall show the following
implications: (iii)$\to$(i)$\to$(iv)$\to$(ii)$\to$(iii). Assuming (iii), we have for every
$\epsilon>0$ a $\delta>0$ such that for every measurable $e$ with $|\alpha|(e)<\delta$ there is an $n_e$
such that

$$\Big\|\int_e f_n(p)\,d\alpha\Big\|<\epsilon,\qquad n\ge n_e.$$

Let it first be supposed that $f_n(p)$ converges almost uniformly on $E$. Then

$$\Big\|\int_E(f_m(p)-f_n(p))\,d\alpha\Big\|\le\Big\|\int_{E-E'(\delta)}(f_m(p)-f_n(p))\,d\alpha\Big\|+\Big\|\int_{E'(\delta)}(f_m(p)-f_n(p))\,d\alpha\Big\|,$$

where $E'(\delta)\supset E-E(\delta)$, $|\alpha|(E'(\delta))<\delta$, and $f_n(p)$ converges uniformly on
$E(\delta)$ and hence on $E-E'(\delta)$. There is an $m_\epsilon$ such that

$$\Big\|\int_{E-E'(\delta)}(f_m(p)-f_n(p))\,d\alpha\Big\|<\epsilon,\qquad\Big\|\int_{E'(\delta)}(f_m(p)-f_n(p))\,d\alpha\Big\|<2\epsilon,\qquad m,n\ge m_\epsilon.$$

Thus for $m,n\ge m_\epsilon$ we have

$$\Big\|\int_E(f_m(p)-f_n(p))\,d\alpha\Big\|<3\epsilon.$$

The same proof holds for an arbitrary $e$ in $a(E)$. The same conclusion follows
if $f_n$ only approaches $f$ approximately, since then every subsequence of
$\int f_n(p)\,d\alpha$ contains a subsequence approaching $\int f(p)\,d\alpha$. Thus (iii)$\to$(i). The
implication (i)$\to$(iv) follows from the preceding theorem. To show that
(iv)$\to$(ii), first note that since $\lim_{|\alpha|(e)=0}\int_e\bar y f_m(p)\,d\alpha=0$ uniformly in $m$, and
$\bar y f_m\to\bar y f$ approximately on $E$, the function $\bar y f$ must be in $L(E)$; hence $f$ is
in $\mathfrak{L}(E)[Y,\Gamma]$. Furthermore for each $\bar y$ we have

(a) $\lim_m\int_e\bar y f_m(p)\,d\alpha=\int_e\bar y f(p)\,d\alpha.$

To complete the proof of (ii) we have to show that (a) holds uniformly for
$\|\bar y\|\le1$ and $e$ in $a(E)$. Now


Gale) = 


where $E'(m,n,\epsilon)$ is a measurable set covering the set $E(m,n,\epsilon)$ upon which
$\|f_m(p)-f_n(p)\|>\epsilon$. Also


| 


= sup | f — falp) dex | < e| (E). 
e—E’(m,n,e) 
There is a $\delta>0$ such that if $|\alpha|(e)<\delta$,


and an my such that 


$$|\alpha|(E'(m,n,\epsilon))<\delta,\qquad n,m\ge m_0.$$

Consequently, since

$$|\alpha|(eE'(m,n,\epsilon))\le|\alpha|(E'(m,n,\epsilon))<\delta,\qquad n,m\ge m_0,$$


| ff f 


and, in view of (a), this proves (ii). From (ii) it follows that $\int_e f(p)\,d\alpha$ is
absolutely continuous. Thus
f 


| J 


hence (ii) implies (iii), which completes the proof of the theorem. 


THEOREM 59. If $f$ is in $\mathfrak{L}^q(E)[Y,\bar Y]$, and if for every $e$ in $a(E)$ there is a
$y$ in $Y$ such that


we have 


2, m= mo; 


= lim = 0; 


la|(e)=0 


lim lim sup 
|a|(e)=0 m 


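$$\bar y y=\int_e\bar y f(p)\,d\alpha,\qquad\bar y\in\bar Y,$$

then for every $\phi$ in $L^{q'}(E)$ there is a $y$ in $Y$ such that

$$\bar y y=\int_E\bar y\phi(p)f(p)\,d\alpha,\qquad\bar y\in\bar Y.$$

This is a corollary of Theorem 45 in the case where $1<q<\infty$. The case
$q=\infty$ is the case where $\bar y f(p)$ is measurable and $|\bar y f(p)|$ is essentially bounded
for every $\bar y$ in $\bar Y$. The integral

$$\int_E\phi(p)f(p)\,d\alpha,$$

which we know always exists in $\bar{\bar Y}$, must be in $Y$, for if $\phi_n$ is a sequence of
finitely valued functions (that is, functions assuming only a finite number of
values) with

$$\|\phi_n-\phi\|=\int_E|\phi_n(p)-\phi(p)|\,d|\alpha|\to0,$$

then $\int_E\phi_n(p)f(p)\,d\alpha$ is in $Y$, and by Theorem 49,

$$\Big\|\int_E\phi_n(p)f(p)\,d\alpha-\int_E\phi(p)f(p)\,d\alpha\Big\|\to0,$$

so that $\int_E\phi(p)f(p)\,d\alpha$ is also in $Y$.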


THEOREM 60. For $f$ in $\mathfrak{L}(E)_0[Y,\bar Y]$ ($\mathfrak{L}(E)[Y,\bar Y]$), the operation $U(\bar y)$
on [Y] to L(E) is weakly continuous. If f is in Q(E)o[Y, ¥], and 
f(E—a null set) is a separable set in Y (or if f is measurable*), then U is com- 
pletely continuous. 


For if $\bar y_n y\to\bar y y$ for every $y$, the integrals
J vs onda = v0 
are equi-absolutely continuous, and since $\bar y_n f(p)\to\bar y f(p)$ for each $p$, we have


A similar proof holds for the case where $f$ is in $\mathfrak{L}(E)[Y,\bar Y]$. The complete
continuity of $U$ is a corollary of Theorem 34.


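THEOREM 61. If $f$ is in $\mathfrak{L}^q(E)_0[Y,\bar Y]$ and $\phi$ is in $L^q(E)$, then a necessary
and sufficient condition that there exist a $\bar y$ in $\bar Y$ with

(i) $\|\bar y\|\le M$, $\bar y f(p)=\phi(p)$

almost everywhere on $E$, is that

(ii) $\left|\int_E\eta(p)\phi(p)\,d\alpha\right|\le M\left\|\int_E\eta(p)f(p)\,d\alpha\right\|$

for every $\eta$ in a set dense in $L^{q'}(E)$.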


If there does exist a $\bar y$ satisfying (i), then (ii) is obviously true. To prove
the converse† first note that if (ii) holds for every $\eta$ in a set dense in $L^{q'}$,
then by Theorem 49 it holds for every $\eta$ in $L^{q'}$. Let $Y_0$ be the linear manifold
in $Y$ determined by

$$\int_E\eta(p)f(p)\,d\alpha,\qquad\eta\in L^{q'}.$$

Now if

$$y=\int_E\eta(p)f(p)\,d\alpha=\int_E\eta'(p)f(p)\,d\alpha,$$

we have


* That is, the limit almost everywhere of a sequence of finitely valued functions. In view of a
result of Pettis [24] and the fact that $\bar y f$ is measurable, the assumption that f(E - a null set) is
separable is equivalent to the assumption that $f$ is measurable.

† The proof parallels Banach's discussion of the moment problem ([1], p. 55, Theorem 4).


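$$\Big|\int_E\eta(p)\phi(p)\,d\alpha-\int_E\eta'(p)\phi(p)\,d\alpha\Big|\le M\Big\|\int_E[\eta(p)-\eta'(p)]f(p)\,d\alpha\Big\|=0,$$

so that

$$\bar y_0(y)=\int_E\eta(p)\phi(p)\,d\alpha$$

is a well defined additive functional on $Y_0$. It is also continuous since, by (ii),

$$|\bar y_0 y|\le M\|y\|.$$

Hence by the Hahn-Banach theorem on the extension of linear functionals
there is a linear functional $\bar y$ defined on the whole of $Y$ with

$$|\bar y y|\le M\|y\|,\qquad y\in Y,$$
$$\bar y y=\bar y_0 y=\int_E\eta(p)\phi(p)\,d\alpha,\qquad y\in Y_0.$$

Thus for every $\eta$ in $L^{q'}(E)$

$$\int_E\eta(p)\bar y f(p)\,d\alpha=\bar y\int_E\eta(p)f(p)\,d\alpha=\int_E\eta(p)\phi(p)\,d\alpha.$$

Upon writing $\psi(p)=\phi(p)-\bar y f(p)$, we have

$$\int_E\eta(p)\psi(p)\,d\alpha=0,\qquad\eta\in L^{q'}(E).$$

According to a theorem of Hahn-Sierpiński ([30], p. 249) every $E'$ in $a(E)$
is representable as the sum of two disjoint sets $E_+'$, $E_-'$ such that $\alpha$ is
non-negative on $a(E_+')$ and non-positive on $a(E_-')$. It follows immediately that
$\psi(p)=0$ almost everywhere.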

THEOREM 62. Let $f_n$ be in $\mathfrak{L}(E)_0[Y,\bar Y]$, $(n=1,2,\dots)$, and $f_n(p)\to f(p)$
approximately on $E$. Then the following assertions are equivalent:

(i) The limit, $\lim_n\int_e f_n(p)\,d\alpha$, exists on $a(E)$.
(ii) The function $f$ is in $\mathfrak{L}(E)_0[Y,\bar Y]$ and

$$\lim_n\int_e f_n(p)\,d\alpha=\int_e f(p)\,d\alpha$$

uniformly on $a(E)$.

(iii) $\lim_{|\alpha|(e)=0}\limsup_m\left\|\int_e f_m(p)\,d\alpha\right\|=0$.
(iv) Uniformly with respect to $n$,

$$\lim_{|\alpha|(e)=0}\int_e f_n(p)\,d\alpha=0.$$

In view of Theorems 54 and 58 there is only one thing to prove here;
namely, that $f$ is in $\mathfrak{L}(E)_0[Y,\bar Y]$. This however follows immediately from
Theorem 59 together with the facts that $Y$ is closed in $\bar{\bar Y}$ and

$$\lim_n\int_e f_n(p)\,d\alpha=\int_e f(p)\,d\alpha.$$
THEOREM 63. Let $E$ be the real interval $(a,b)$, $\alpha$ Lebesgue measure, $\phi$ real
and continuous on $(a,b)$, and $f$ in $\mathfrak{L}(E)_0[Y,\bar Y]$ or $\mathfrak{L}(E)[Y,\bar Y]$. Then if

$$\beta(u)=\int_a^u f(t)\,dt,$$

the Riemann-Stieltjes integral $\int_a^b\phi(u)\,d\beta(u)$ exists and

$$\int_a^b\phi(u)\,d\beta(u)=\int_a^b\phi(u)f(u)\,du.$$

That $\int_a^b\phi(u)\,d\beta(u)$ exists in the Riemann-Stieltjes sense follows from
Theorems 11, 12, and 13; and the equality follows from the corresponding theorem
for numerical functions.


THEOREM 64. Let $E$, $\alpha$ be as in the preceding theorem, and let $f$ be in
$\mathfrak{L}(E)_0[Y,\bar Y]$. If $Y$ is separable, then for every real function $\Phi$ absolutely
continuous on $a\le t\le b$ we have almost everywhere,


= #9) f sou. 


This is not an immediate corollary of Theorem 55 as one might think at
first sight. By that theorem $F$ is in $\mathfrak{L}(E)[Y,\bar Y]$, where

$$F(p)=\phi(p)\int_a^p f(t)\,dt$$

and

$$\int_a^s F(p)\,dp=\Phi(s)\int_a^s f(t)\,dt-\int_a^s\Phi(t)f(t)\,dt.$$


Since $f$ is in $\mathfrak{L}(E)_0[Y,\bar Y]$, it follows that $F$ is in $\mathfrak{L}(E)_0[Y,\bar Y]$. This is still not
enough however, for an indefinite integral of a function even in $\mathfrak{L}(E)_0[Y,\bar Y]$ is
not necessarily differentiable. It will be true that

$$\frac{d}{ds}\int_a^s F(p)\,dp=F(s)$$

almost everywhere, (which is what we want to show) if $F(p)$ is measurable
and $\|F(p)\|$ is summable, that is, if $F$ belongs to the class of absolutely
integrable and measurable functions (see Bochner [4]). By Theorem 2, $\int_a^p f(t)\,dt$
is bounded on $(a,b)$, and for each $\bar y$ in $\bar Y$ the function $\bar y\int_a^p f(t)\,dt$ is measurable
(in fact absolutely continuous) on $(a,b)$. Since $Y$ is separable, it follows from
a theorem of Pettis [24] that $\int_a^p f(t)\,dt$ is measurable. Thus $F(p)$ is the product
of a real summable function and a bounded measurable function and hence is
measurable and absolutely integrable.


THEOREM 65. Let $E$, $\alpha$, $f$, $Y$ be as in the preceding theorem.* If $f$ is defined
to be constant outside of $(a,b)$ and $f_h(t)=f(t+h)$, then $\lim_{h=0}\|f_h-f\|=0$, that is,

$$\lim_{h=0}\int_a^b|\bar y f(t+h)-\bar y f(t)|\,dt=0$$

uniformly for $\|\bar y\|\le1$.
This follows immediately from Theorem 34 by taking the class $T$ as a
class consisting of a single element.


CHAPTER IV 


4.0. Instances. It is not the purpose of the present chapter to apply the 
preceding results, as we intend to do that later, but merely to call attention 
to some special instances of a few of the theorems. 

As an instance of Theorem 10 we mention the following theorem which is 
a generalization of a theorem of Hahn-Steinhaus [12], given by Saks and 
Tamarkin [32]. 

THEOREM 66. Let $V$ be an arbitrary set $(v)$ and $U$ a set $(u)$ in which there
is a notion of null set satisfying condition (N). If for every $u$ in $U$ (or almost
all $u$ in $U$) and every $v$ in $V$ the function $H(t,u,v)$ is of bounded variation on
$a\le t\le b$ and normalized so that†

$$H(t,u,v)=\tfrac12[H(t+0,u,v)+H(t-0,u,v)],$$


* In both of these theorems it is not entirely necessary that Y be separable. All that is needed 
is that f(E—a null set) be separable which, in view of Pettis’ result, is equivalent to saying that f is 
measurable. 

† Or normalized in any other manner so that its total variation is the same as the norm of the
linear functional it defines on $C$.


then if, for every continuous function $\phi$ defined on $(a,b)$, there is a null set $U_\phi\subset U$
and an $M_\phi$ such that

$$\int_a^b\phi(t)\,d_tH(t,u,v)<M_\phi,\qquad u\in U-U_\phi,\ v\in V,$$

there will be a null set $U_0\subset U$ and a constant $M$ such that

$$\int_a^b|d_tH(t,u,v)|<M,\qquad u\in U-U_0,\ v\in V.$$


This corresponds to taking $Y=BV$, $\Gamma=C$. Another choice might be
$Y=M$, $\Gamma=L$. This would yield the theorem:


THEOREM 67. Let $U$ and $V$ be as in the preceding theorem, and $H(t,u,v)$
essentially bounded in $t$ for almost all $u$ and all $v$. Then if for every summable
function $\phi$ there is a null set $U_\phi\subset U$ and a constant $M_\phi$ such that

$$\int_a^b\phi(t)H(t,u,v)\,dt<M_\phi,\qquad u\in U-U_\phi,\ v\in V,$$

there is a null set $U_0\subset U$ and a constant $M$ such that

$$\operatorname{ess.\,sup.}_t|H(t,u,v)|<M,\qquad u\in U-U_0,\ v\in V.$$


THEOREM 68. Let $a(E)$, $a(E')$ be two families consisting of all measurable
subsets of the measurable sets $E$ and $E'$, respectively. Let $\alpha$, $\alpha'$ be positive, finite,
completely additive set functions on $a(E)$, $a(E')$, respectively. Let $K(p,p')$ be a
real function on $EE'$ such that the integral

$$\int_{e'}K(p,p')\,d\alpha',\qquad e'\in a(E'),$$

exists for almost all (with respect to $\alpha$) $p$ in $E$ and is essentially bounded and
measurable on $E$. If for every $\phi$ in $L(E,\alpha)$ there is a constant $M_\phi$ such that

$$\Big|\int_E\phi(p)\,d\alpha\int_{e'}K(p,p')\,d\alpha'\Big|\le M_\phi\alpha'(e'),$$

then for every $\psi$ in $L(E',\alpha')$ the integral

$$\int_{E'}K(p,p')\psi(p')\,d\alpha'$$

exists for almost all $p$, is essentially bounded and measurable in $p$, and

$$\lim_{\psi=0}\ \operatorname{ess.\,sup.}_p\Big|\int_{E'}K(p,p')\psi(p')\,d\alpha'\Big|=0.$$


In Theorem 18 take $Z=L(E',\alpha')$, $Y=L^\infty(E,\alpha)$, $\Gamma=L(E,\alpha)$, and

$$f(z)=\int_{E'}K(p,p')z(p')\,d\alpha'.$$


By assumption f(z) is defined for all finitely valued functions, so that D(f) 
is dense in L(E’, a’). Also by assumption 
| | < 


providing $z$ is the characteristic function of a measurable set $e'$. It is readily
shown that this same inequality holds for any finitely valued $z$, and the adjoint
of $f$ is thus defined for every $\bar y$ in $\Gamma$. Then by Theorem 18, $f(z)$ can be extended
to be linear and bounded on $L(E',\alpha')$. Let $z_n$ be finitely valued functions
approaching $z$ in $L(E',\alpha')$. Then there exists a set $E_0\in a(E)$ with $\alpha(E-E_0)=0$
such that
such that 


converges uniformly with respect to $p$ in $E_0$. Furthermore for each $p$ in $E_0$

$$K(p,p')z_n(p')\to K(p,p')z(p')$$

approximately with respect to $\alpha'$ on $E'$, and


ess. sup. J K(p, | = J | 2n(p") | de’, 

D e’ 
where y,, is the characteristic function of e’. This shows that the integrals 
$\int_{e'}K(p,p')z_n(p')\,d\alpha'$ are, at least for $p$ in a set of measure $\alpha(E)$, equi-absolutely
continuous, and thus $\int_{E'}K(p,p')z(p')\,d\alpha'$ exists almost everywhere and
is essentially bounded. The last conclusion follows from Theorem 18.


THEOREM 69. Let $Z$ be a Banach space and $f(z)=K(z,t)$ an additive
function on $Z$ to $L_q(0,1)$. If

$$\lim_{z=0}\int_0^sK(z,t)\,dt=0,\qquad0\le s\le1,$$

then

$$\lim_{z=0}\ \operatorname{ess.\,sup.}_t|K(z,t)|=0,\qquad\text{if }q=\infty,$$

$$\lim_{z=0}\left(\int_0^1|K(z,t)|^q\,dt\right)^{1/q}=0,\qquad\text{if }1\le q<\infty.$$

In case $1<q\le\infty$, this is Theorem 21 applied to $L^{q'}(0,1)$. For $q=1$ one
can use Theorem 20 with $Y=L$, $\Gamma=C=$ continuous functions, $\Gamma^*=$
characteristic functions of intervals.


THEOREM 70. Let $Z$ be a Banach space, and $f(z)=K(z,t)$ an additive
function on $Z$ to the space $C$ (where $C$ is the space of continuous functions on (0, 1)).
If

$$\lim_{z=0}K(z,t)=0,\qquad0<t<1,$$

then

$$\lim_{z=0}\int_0^1|d_tK(z,t)|=0.$$

This is Theorem 20 applied to the case where $Y=C$, $\Gamma=C$, $\Gamma^*=$
characteristic functions of intervals.
Assume $p>1$; then an instance of Theorem 34 (or 60), if we use the fact
that if $Y=\bar{\bar Y}$, then $\mathfrak{L}(E)[Y,\bar Y]=\mathfrak{L}(E)_0[Y,\bar Y]$, is the following theorem:

THEOREM 71. If for almost all $s$ in (0, 1) the function $H(s,t)$ is in $L^{p'}(0,1)$,
and if

$$U(\phi)=\psi(s)=\int_0^1H(s,t)\phi(t)\,dt$$

is in $L(0,1)$ for every $\phi$ in $L^p(0,1)$, then the operation $U(\phi)$ is completely
continuous on $L^p$ to $L$.
Likewise we have the following theorem:

THEOREM 72. If for almost all $s$ in (0, 1) the sequence $a_n(s)$ is in $l^{p'}$ and

$$U(x)=\sum_{n=1}^\infty\xi_na_n(s)$$

is in $L(0,1)$ for every vector $x=\{\xi_n\}$ in $l^p$, then the operation $U(x)$ on $l^p$ to $L$
is completely continuous.


As instances of Theorem 32 we have, if $T$ is a class with one element and
we take first $Y=l^{p'}$, $\Gamma=l^p$ and then $Y=L^{p'}$, $\Gamma=L^p$, the following two
theorems:

THEOREM 73. Every linear operator from $l^p$, $(p>1)$, to $l$ is completely
continuous.

THEOREM 74. Let $a_n(s)$ be in $L^{p'}(0,1)$, $(n=1,2,\dots)$, and suppose that

$$U(\phi)=\left\{\int_0^1a_n(s)\phi(s)\,ds\right\}$$

is in $l$ for every $\phi$ in $L^p$. Then $U$ is completely continuous.


As a matter of fact any continuous linear operator from $Y$ to $l$ is completely
continuous provided $Y=\bar{\bar Y}$. This fact has been proved by Pettis [25] and
follows also from Theorem 30 together with the remarks following Theorem 32.


THEOREM 75. If $\{y_n\}$ is a sequence of points in a Banach space $Y$ such that
$\sum_n\xi_ny_n$ converges for every $x=\{\xi_n\}$ in $m$, then the operation

$$U(x)=\sum_{n=1}^\infty\xi_ny_n$$

on $m$ to $Y$ is completely continuous.*

By hypothesis all partial sums of $\sum_ny_n$ are convergent; hence by a result
of Orlicz [23] the series $\sum_ny_n$ is unconditionally convergent. The operation
$V(\bar y)=\{\bar yy_n\}$ is therefore a linear operation on $\bar Y$ to $l$. By Theorem 32, $V$ is
completely continuous, and since $U$ is the adjoint of $V$, it follows from a
result of Schauder [33] that $U$ is completely continuous also.

If we had been working in the complex domain instead of the real domain,
the following theorem would be an immediate corollary of the definition of
the integral. As it is, a few words of explanation are necessary. In this final
theorem $Y$ is a complex Banach space, that is, a complete normed vector
space satisfying all the postulates for a Banach space with the real number
system replaced by the complex number system (see Wiener [38]). Let $\bar Y$ be
the space of all complex valued continuous linear functionals on $Y$. Then upon
placing

$$\|\bar y\|=\sup_{\|y\|=1}|\bar yy|$$

for $\bar y$ in $\bar Y$, the space $\bar Y$ is also a complex Banach space.
The space $Y$ may be regarded as a real Banach space; thus ([1], p. 55) for
each $y_0$ there is a real function $u$ such that

$$u(y_0)=\|y_0\|,\qquad\sup_{\|y\|=1}|u(y)|=1,$$

$$u(c_1y_1+c_2y_2)=c_1u(y_1)+c_2u(y_2),\qquad c_1,c_2\ \text{real}.$$

The function $u$ is not in $\bar Y$ however because $u(cy)\ne cu(y)$ for $c$ complex. However
by defining $u'(y)=-u(iy)$, the function

$$\bar yy=[1/(2)^{1/2}][u(y)+iu'(y)]$$

is continuous and linear, and $\bar y(cy)=c\,\bar yy$ for complex values of $c$. Thus $\bar y$ is in $\bar Y$.
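The passage from the real functional $u$ to a complex linear functional can be checked numerically. The sketch below is only an illustration under assumptions not in the paper: $Y$ is taken to be complex 3-space, $u$ is an arbitrary real-linear functional built from two real coefficient vectors, and the helper names are hypothetical. It measures how far a functional is from satisfying $F(cy)=cF(y)$ for random complex $c$.

```python
import math
import random


def make_real_functional(a, b):
    """u(y) = sum(a_k * Re y_k + b_k * Im y_k): real-linear, but in general
    not homogeneous with respect to complex scalars."""
    def u(y):
        return sum(ak * yk.real + bk * yk.imag for ak, bk, yk in zip(a, b, y))
    return u


def complexify(u):
    """Build y -> [u(y) + i u'(y)] / sqrt(2) with u'(y) = -u(i y)."""
    def u_prime(y):
        return -u([1j * yk for yk in y])

    def ybar(y):
        return (u(y) + 1j * u_prime(y)) / math.sqrt(2)
    return ybar


def max_homogeneity_defect(func, trials=200, dim=3, seed=0):
    """Largest observed |func(c*y) - c*func(y)| over random complex c and y."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        y = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]
        c = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        scaled = [c * yk for yk in y]
        worst = max(worst, abs(func(scaled) - c * func(y)))
    return worst


u = make_real_functional([1.0, -2.0, 0.5], [0.3, 0.0, -1.5])
ybar = complexify(u)
defect_of_u = max_homogeneity_defect(lambda y: complex(u(y)))
defect_of_ybar = max_homogeneity_defect(ybar)
```

The defect of `ybar` is at the level of rounding error, while the defect of `u` itself is not, which mirrors the remark that $u$ fails to be in $\bar Y$ although the combined functional is.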
It is readily shown that 


lvl] $1, = |] 


* This theorem was proved for the case Y =1 by Littlewood [19]. 
so that


A determining manifold $\Gamma\subset\bar Y$ is defined to be a closed linear manifold in $\bar Y$
such that

$$\sup_{\|\bar y\|=1}|\bar yy|\ge M\|y\|,\qquad y\in Y,$$

where $\bar y$ is restricted to be in $\Gamma$ and $M$ is some positive constant independent
of $y$.

It is readily shown that a function $y(t)$ on an arbitrary range $T$ is bounded
if $\bar yy(t)$ is bounded for every $\bar y$ in a determining manifold $\Gamma\subset\bar Y$. For then $\Gamma$
is the sum of the closed sets

$$\Gamma_m=\Gamma\big[|\bar yy(t)|\le m,\ t\in T\big],$$

and by the Baire category theorem there is a sphere $S(\bar y_0,r)$ of $\Gamma$ contained
in $\Gamma_{m_0}$ for some integer $m_0$. Thus if $\|\bar y\|<r$,

$$|\bar yy(t)|\le|(\bar y_0+\bar y)y(t)|+|\bar y_0y(t)|\le2m_0,\qquad t\in T,$$

so that, for any $\bar y$ in $\Gamma$ with $\|\bar y\|\le1$, $|\bar yy(t)|\le2m_0/r$ on $T$ and thus

$$\|y(t)\|\le2m_0/(rM),\qquad t\in T.$$

In terms of this notation we have the following theorem:
THEOREM 76. Suppose $D$ is a simply connected open set in the complex plane,
$Y$ is an arbitrary complex Banach space, and $f(z)$ on $D$ to $Y$ is such that $\bar yf(z)$
is analytic for every $\bar y$ in $\Gamma$. Then $\bar yf(z)$ is analytic for every $\bar y$ in $\bar Y$ and*
(i) $\int_Cf(z)\,dz=0$ for any rectifiable curve $C$ in $D$;
(ii) $f(z)=\frac{1}{2\pi i}\int_C\frac{f(\zeta)\,d\zeta}{\zeta-z}$ if $C$ contains $z$;
(iii) $f(z)$ has strong derivatives of all orders and
(iv) $f^{(n)}(z)=\frac{n!}{2\pi i}\int_C\frac{f(\zeta)\,d\zeta}{(\zeta-z)^{n+1}}$ if $C$ contains $z$;
(v) the Taylor expansion

$$\sum_{n=0}^\infty\frac{(z-\zeta)^n}{n!}f^{(n)}(\zeta)$$

converges uniformly to $f(z)$ for $z$ in any circle $|z-\zeta|\le r$ inside $D$.
All of the above integrals can be taken in the Riemann sense.
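Formula (ii) and the remark about Riemann sums can be tried on a small example. The following is a sketch only, with assumptions that are not part of the theorem: $Y$ is complex 2-space, $C$ is the unit circle, the sample function is entire, and `cauchy_integral` is a hypothetical helper that forms a Riemann sum componentwise.

```python
import cmath
import math


def cauchy_integral(f, z, center=0j, radius=1.0, n=400):
    """Riemann sum for (1 / (2 pi i)) * integral over the circle of
    f(zeta) / (zeta - z) d zeta, taken componentwise for vector-valued f."""
    dim = len(f(center))
    total = [0j] * dim
    for k in range(n):
        theta = 2 * math.pi * k / n
        zeta = center + radius * cmath.exp(1j * theta)
        # d zeta = i * (zeta - center) * d theta on the circle
        dzeta = 1j * (zeta - center) * (2 * math.pi / n)
        values = f(zeta)
        for j in range(dim):
            total[j] += values[j] / (zeta - z) * dzeta
    return [t / (2j * math.pi) for t in total]


def f(z):
    """Sample analytic function with values in complex 2-space."""
    return [cmath.exp(z), z * z]


z0 = 0.3 + 0.2j
approx = cauchy_integral(f, z0)
exact = f(z0)
error = max(abs(a - b) for a, b in zip(approx, exact))
```

For this smooth periodic integrand the equally spaced Riemann sum is very accurate, so `error` is tiny; for a general rectifiable curve one would expect slower convergence.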


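To prove (ii) note first that the boundedness of $\bar yf(z)$ on $C$ for every $\bar y$
in $\Gamma$ implies the boundedness of $\|f(z)\|$ on $C$. Now from the formula

$$\bar yf(z)=\frac{1}{2\pi i}\int_C\frac{\bar yf(\zeta)\,d\zeta}{\zeta-z}$$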


* If the boundary of $D$ is a rectifiable Jordan curve, and $\bar yf(z)$ is continuous in the closed domain
$D$ for each $\bar y$ in $\Gamma$, then the curve $C$, appearing in the various integrals, may be taken as the bounding
curve.

and the boundedness of $\|f(z)\|$ on $C$ it follows that

$$\lim_{h=0}\bar y[f(z+h)-f(z)]=0$$

uniformly for $\|\bar y\|\le1$, that is, $\lim_{h=0}f(z+h)=f(z)$. Thus the integral

$$g(z)=\frac{1}{2\pi i}\int_C\frac{f(\zeta)\,d\zeta}{\zeta-z}$$

exists in the Riemann sense. Since $\bar yf(z)=\bar yg(z)$ for every $\bar y$ in a determining
manifold, we have

$$f(z)=g(z).$$

The remaining conclusions of the theorem follow in the usual manner from
the above formula.


GROUPS WITH ABELIAN CENTRAL
QUOTIENT GROUP†


BY 
REINHOLD BAER 


In recent years much progress has been made in the study of groups whose
central quotient groups are abelian.‡ Such a group is an extension of an
abelian group by an abelian group, and the usual method of solving the im- 
plied extension problem may be described as follows: If G is such a group, 
then a maximum abelian subgroup V of G is chosen. V contains the central 
of G, and G/V represents therefore exactly an abelian group of automor- 
phisms of the abelian group V. Thus it is possible to apply the results of the 
theory of automorphisms of abelian groups. This method is rather powerful 
and yields very interesting results. On the other hand it is not restricted to 
this class of groups and may in fact be applied to all groups with abelian 
commutator groups.§ Finally it has to be mentioned that this method is not 
an invariant one, since the maximum abelian subgroups are in no sense 
uniquely determined. 

In this paper another method will be indicated. If G is a group whose 
central quotient group is abelian, then preference is given to a subgroup S 
which is situated between the central and the commutator group of G. This 
subgroup S will be left indeterminate as long as possible. But as soon as the 
final results are reached, either the central or the commutator group of G will 
take the place of S according to which of these choices will give better results. 

The extension problem presents itself now in the following form: To char- 
acterize those groups whose central contains a given abelian group S and 
whose quotient group (mod S) is isomorphic to a preassigned abelian group 
G*. Each group with these properties induces certain invariant relations be- 
tween the given abelian groups G* and S, and these invariants turn out to 
be characteristic invariants, provided G* is a direct product of (a finite or 
infinite number of finite and infinite) cyclic groups. This last hypothesis will 
be the only restriction of generality imposed on the investigated groups, so 
that the finite groups with abelian central quotient groups are included in
our treatment. 

† Presented to the Society, September 7, 1937; received by the editors September 20, 1937.

‡ Cf., for example, H. R. Brahana, American Journal of Mathematics, vol. 57 (1935), pp. 645-667,
and Duke Mathematical Journal, vol. 1 (1935), pp. 185-197; also C. Hopkins, these Transactions,
vol. 37 (1935), pp. 161-195; vol. 41 (1937), pp. 287-313.

§ K. Taketa, Japanese Journal of Mathematics, vol. 13 (1937), pp. 129-232.

1. Basic concepts and formulas. If $G$ is any group, then $Z(G)$ denotes
the central, and $C(G)$ the commutator group of $G$. If $n$ is any positive integer,
then $G_n$ is generated by the elements $x$ which satisfy $x^n=1$, and $G^n$ is
generated by the elements of the form $x^n$ for $x$ in $G$. (If $G$ is an abelian group,
then $G_n$ consists of the elements $x$ with $x^n=1$, and $G^n$ consists of the elements
$x^n$ for $x$ in $G$.) Finally let $(x,y)=xyx^{-1}y^{-1}$.

(1.1) If $G$ and $S$ are groups such that $C(G)\le S\le Z(G)$, then the operation
$(S<G;\ x^*,y^*)=x^*\circ y^*$, which is defined by $Sx\circ Sy=(x,y)$, obeys the following
rules:

(1) $x^*\circ y^*$ is, for every $x^*$, $y^*$ in $G^*=G/S$, a uniquely determined element
in $S$.

(2) $1=x^*\circ x^*=(x^*\circ y^*)(y^*\circ x^*)$.

(3) $x^*\circ(y^*z^*)=(x^*\circ y^*)(x^*\circ z^*)$.

(4) The element $z$ in $G$ belongs to $Z(G)$ if, and only if, $Sz\circ x^*=1$ for every $x^*$
in $G^*$.

Proof. The statements (1), (2), and (4) are obvious. (3) may be proved
as follows:

$$Sx\circ SySz=xyzx^{-1}z^{-1}y^{-1}=xyx^{-1}(xzx^{-1}z^{-1})y^{-1}=(x,y)(x,z),$$

since every commutator is an element of the central.
An obvious consequence of assertion (1.1), (4) is the following proposition:

(1.2) Suppose that $C(G)\le S\le Z(G)$. Then $S=Z(G)$ if, and only if, 1 is the
only element $w^*$ in $G/S$ such that $w^*\circ x^*=1$ for every element $x^*$ in $G/S$.
(1.3) If $x$ and $y$ are elements of the group $G$ with abelian central quotient
group, then

$$(xy)^i=(y,x)^{i(i-1)/2}x^iy^i$$

for every positive integer $i$.

Proof. This statement is certainly true for $i=1$. If $1<i$, then

$$x^{i-1}y^{i-1}xy=(y,x)^{i-1}x^iy^i$$

since the commutators are elements of the central, and since (2) and (3) of
(1.1) may be applied. If (1.3) holds true for $i-1$, then

$$(xy)^i=(y,x)^{(i-1)(i-2)/2}x^{i-1}y^{i-1}xy=(y,x)^{i(i-1)/2}x^iy^i,$$

and thus (1.3) holds true for every $i$.

(1.4) Suppose that $C(G)\le S\le Z(G)$ and that $x$ and $y$ are elements of $G$ such
that $Sx$ and $Sy$ are both contained in $(G/S)_n$.

(a) If at least one of the elements $Sx$ and $Sy$ is contained in $(G/S)^2$ (this
would certainly be the case, if $n$ is odd), then $(xy)^n=x^ny^n$.

(b) If $n=2m$ is an even integer, then

$$(xy)^{2m}=(Sx\circ Sy)^mx^{2m}y^{2m}\quad\text{and}\quad((Sx\circ Sy)^m)^2=1.$$

Proof. It follows from (1.3) that

$$(xy)^n=(Sy\circ Sx)^{n(n-1)/2}x^ny^n,$$

and the statement (b) is now a consequence of (1.1). In order to prove (a),
assume that $Sy$ belongs to $(G/S)^2$. Then $Sy=Sz^2$ for some $z$ in $G$ and

$$(Sy\circ Sx)^{n(n-1)/2}=(Sz\circ Sx^n)^{n-1}=1$$

by (1.1). This completes the proof.
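Rule (3) of (1.1), the power formula (1.3) and statement (1.4)(b) can be checked by brute force in a small group whose commutators are central. The sketch below is an illustration under assumptions not made in the paper: the group is the Heisenberg group over the integers mod n, written as triples, with S taken to be its central, and all function names are hypothetical.

```python
from itertools import product

IDENTITY = (0, 0, 0)


def make_group(n):
    """Heisenberg group over Z/n: (a,b,c)(a',b',c') = (a+a', b+b', c+c'+a*b')."""
    def mul(x, y):
        return ((x[0] + y[0]) % n, (x[1] + y[1]) % n,
                (x[2] + y[2] + x[0] * y[1]) % n)

    def inv(x):
        return ((-x[0]) % n, (-x[1]) % n, (x[0] * x[1] - x[2]) % n)

    return mul, inv, list(product(range(n), repeat=3))


def power(mul, x, k):
    """x multiplied by itself k times (k >= 0)."""
    result = IDENTITY
    for _ in range(k):
        result = mul(result, x)
    return result


def commutator(mul, inv, x, y):
    """(x, y) = x y x^-1 y^-1, the convention of the text."""
    return mul(mul(x, y), mul(inv(x), inv(y)))


def check_rule_3(n):
    """(x, yz) = (x, y)(x, z) for all x, y, z."""
    mul, inv, elements = make_group(n)
    return all(commutator(mul, inv, x, mul(y, z))
               == mul(commutator(mul, inv, x, y), commutator(mul, inv, x, z))
               for x, y, z in product(elements, repeat=3))


def check_power_formula(n, max_i=6):
    """(xy)^i = (y, x)^(i(i-1)/2) x^i y^i for 1 <= i <= max_i."""
    mul, inv, elements = make_group(n)
    for x, y in product(elements, repeat=2):
        c = commutator(mul, inv, y, x)
        for i in range(1, max_i + 1):
            lhs = power(mul, mul(x, y), i)
            rhs = mul(mul(power(mul, c, i * (i - 1) // 2), power(mul, x, i)),
                      power(mul, y, i))
            if lhs != rhs:
                return False
    return True


def check_even_case(m):
    """(xy)^(2m) = (x, y)^m x^(2m) y^(2m) and ((x, y)^m)^2 = 1 over Z/(2m)."""
    n = 2 * m
    mul, inv, elements = make_group(n)
    for x, y in product(elements, repeat=2):
        twist = power(mul, commutator(mul, inv, x, y), m)
        lhs = power(mul, mul(x, y), n)
        rhs = mul(mul(twist, power(mul, x, n)), power(mul, y, n))
        if lhs != rhs or mul(twist, twist) != IDENTITY:
            return False
    return True
```

Calling `check_rule_3(3)`, `check_power_formula(4)` and `check_even_case(2)` runs the three checks on groups of order 27 and 64; a passing check is of course only evidence for this one family of groups, not a proof.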

(1.5) If $C(G)\le S\le Z(G)$, and if the function $P(n,x^*)=P(S<G;\ n,x^*)$ is
defined by $P(n,Sx)=S^nx^n$, then the following statements are true:

(1) $P(n,x^*)$ is, for every $x^*$ in $(G/S)_n$, a uniquely determined element in
$S/S^n$.

(2') If $x^*$ and $y^*$ are both elements of $(G/S)_n$, and at least one of them is an
element of $(G/S)^2$ (for example, if $n$ is odd), then $P(n,x^*y^*)=P(n,x^*)P(n,y^*)$.

(2'') If $x^*$ and $y^*$ are both elements of $(G/S)_{2m}$, then

$$P(2m,x^*y^*)=(x^*\circ y^*)^mP(2m,x^*)P(2m,y^*),$$

and $(x^*\circ y^*)^m$ is either the unit or an element of order 2.

(3) $P(nm,x^*)=P(n,x^*)^m$ for positive $m$ and $x^*$ in $(G/S)_n$; and $P(n,x^{*m})
=S^nP(nm,x^*)$ for $x^*$ in $(G/S)_{nm}$.

Proof. (1), (2'), and (2'') are consequences of (1.4). If $Sx$ belongs to
$(G/S)_n$ and $m$ is a positive integer, then $Sx$ belongs to $(G/S)_{nm}$ and $P(n,Sx)^m
=(S^nx^n)^m=S^{nm}x^{nm}=P(nm,Sx)$. If $Sx$ belongs to $(G/S)_{nm}$, then $Sx^m$ belongs
to $(G/S)_n$, and $P(n,Sx^m)=S^nx^{nm}=S^nP(nm,Sx)$.

2. Existence theorems. We prove the following theorem: 

THEOREM 2.1. Suppose that $S$ and $G^*$ are abelian groups, that $x^*\circ y^*$ is, for
every $x^*$ and $y^*$ in $G^*$, an element in $S$, and that $P(n,x^*)$ is an element in $S/S^n$
for every positive integer $n$ (if $n$ is such that elements of order $n$ exist in $G^*$) and
for every $x^*$ in $G^*_n$.

There exists a group $G$ such that $C(G)\le S\le Z(G)$ and an isomorphism $\gamma$ of
$G/S$ upon the whole group $G^*$ such that $(Sx)^\gamma\circ(Sy)^\gamma=(x,y)$ for $x$ and $y$ in $G$ and
$P(n,(Sx)^\gamma)=S^nx^n$ for $Sx$ in $(G/S)_n$ (provided there exist elements of order $n$
in $G^*$) if, and only if, the following conditions are satisfied:

(1) $1=x^*\circ x^*=(x^*\circ y^*)(y^*\circ x^*)$, $x^*\circ(y^*z^*)=(x^*\circ y^*)(x^*\circ z^*)$.

(2) $P(n,x^*y^*)=P(n,x^*)P(n,y^*)$, if $n$ is odd; and $P(2m,x^*y^*)
=(x^*\circ y^*)^mP(2m,x^*)P(2m,y^*)$.

(3) $P(nm,x^*)=P(n,x^*)^m$, if $x^*$ is an element of $G^*_n$. $P(n,x^{*m})=S^nP(nm,x^*)$,
if $x^*$ is an element of $G^*_{nm}$.

Proof. The necessity of the conditions (1) to (3) is a consequence of (1.1)
and (1.5). Suppose now that conditions (1) to (3) are satisfied. There exist
an ordinal number $t$ and, for every ordinal number $v$ with $0\le v\le t$, a group
$G^*(v)$ with the following properties: $G^*(0)=1$, $G^*(v)<G^*(v')$ for $v<v'$,
$G^*(t)=G^*$; $G^*(v+1)/G^*(v)$ is a cyclic group whose order is either infinite or a
prime number; and if $v$ is a limit ordinal, then $G^*(v)$ is the join of the groups
$G^*(u)$ for $u<v$.

By complete (transfinite) induction with regard to $v$, groups $G(v)$ and
isomorphisms $\gamma(v)$ will be defined which obey the following rules:

(i) $C(G(v))\le S\le Z(G(v))$.

(ii) $G(v)<G(v')$ for $0\le v<v'\le t$.

(iii) $\gamma(v)$ is an isomorphism of $G(v)/S$ upon the whole group $G^*(v)$.

(iv) If $x$ is an element in $G(v)$ and $v<v'$, then $(Sx)^{\gamma(v)}=(Sx)^{\gamma(v')}$.

(v) If $x$ and $y$ are elements in $G(v)$, then $(x,y)=(Sx)^{\gamma(v)}\circ(Sy)^{\gamma(v)}$.

(vi) If $G^*$ contains elements of the finite order $n$, and if $Sx$ belongs to
$(G(v)/S)_n$, then $S^nx^n=P(n,(Sx)^{\gamma(v)})$.

The choices $G(0)=S$ and $\gamma(0)=1$ are in accordance with these rules, since
$G^*(0)=1$. It may therefore be assumed that for every $u<v$ a group $G(u)$ and
an isomorphism $\gamma(u)$ have been defined which satisfy (i) to (vi).

Case 1. $v=w+1$ is not a limit ordinal. $G^*(w+1)/G^*(w)$ is a cyclic group.
It is either infinite or its order is a prime number $p$. Denote by $b^*$ an element
of $G^*(v)$ generating $G^*(v)$ (mod $G^*(w)$). If $G^*(w)b^*$ contains elements of finite
order, then $b^*$ may be chosen as an element of minimum order in its class
$G^*(w)b^*$. If $G^*(w+1)/G^*(w)$ is of finite order $p$, then let $c$ be an element in
$G(w)$ such that $(Sc)^{\gamma(w)}=b^{*p}$.

In order to define $G(v)$ and $\gamma(v)$ we require first the following results:

(2.11) An automorphism $\beta$ of $G(w)$ is defined by $x^\beta=(b^*\circ(Sx)^{\gamma(w)})x$. All
the elements of $S$ are fixed elements of $\beta$. If $G^*(v)/G^*(w)$ is of finite order $p$, then
$c$ is a fixed element of $\beta$ and the automorphism $\psi=\beta^p$ is the inner automorphism
of $G(w)$ induced by $c$.

By its definition $x^\beta$ is, for every $x$ in $G(w)$, a uniquely determined element
of $G(w)$. If $x$ is an element in $S$, then $Sx=S$ and $x^\beta=x$ by (1). If $x$ and $y$ are
elements of $G(w)$, then $(xy)^\beta=(b^*\circ(Sxy)^{\gamma(w)})xy=(b^*\circ(Sx)^{\gamma(w)})(b^*\circ(Sy)^{\gamma(w)})xy
=x^\beta y^\beta$ by (1) and (i). If $x^\beta=1$, then $x$ is an element of $S$, consequently $x=1$.
From these facts it follows that $\beta$ is a (proper) automorphism of $G(w)$.

Suppose now that $G^*(v)/G^*(w)$ is of finite order $p$. Then $c^\beta=(b^*\circ(Sc)^{\gamma(w)})c
=(b^*\circ b^{*p})c=c$ by (1). Since the elements of $S$ are fixed elements of the
automorphism $\beta$,

$$x^\psi=(b^*\circ(Sx)^{\gamma(w)})^px=(b^{*p}\circ(Sx)^{\gamma(w)})x$$

by (1), and

$$x^\psi=((Sc)^{\gamma(w)}\circ(Sx)^{\gamma(w)})x=cxc^{-1}$$

by (v). This completes the proof of (2.11).


(2.12) There exists a group G(v) with the following properties: 

(a) G(w) is a normal subgroup of G(v), and G(v)/G(w) is a cyclic group of 
the same order as G*(v)/G*(w). 

(b) G(v) contains an element b which generates G(v) (mod G(w)) and satisfies 
the relation b x b^{-1} = x^β for every x in G(w). 

(c) If G*(v)/G*(w) is of finite order p, then b^p is an element of G(w), and 
Sc = Sb^p. 

(d) If b* is an element of finite order pn, then S^{pn} b^{pn} = P(pn, b*). 

To prove this note first that G*(v)/G*(w) is of finite order p if b* is of 
finite order. The order of b*, if finite, is therefore a multiple pn of p. It follows 
from (3) and (vi) that there exists an element c′, satisfying Sc = Sc′ and 
P(pn, b*) = S^{pn} c′^n. Now it follows from (2.11) and general theorems in the 
theory of extensions of groups† that the group G(v) which is generated by 
G(w) and an element b subject to the relations b x b^{-1} = x^β for x in G(w) and 
b^p = c, if b* is of infinite order and G*(v)/G*(w) of finite order p, or b^p = c′, if 
b* is of finite order pn and G*(v)/G*(w) of finite order p, satisfies the 
conditions (a) to (d). 


(2.13) There exists one and only one isomorphism γ(v) of G(v)/S upon 
G*(v) which satisfies 

(Sb)^{γ(v)} = b*, (Sx)^{γ(v)} = (Sx)^{γ(w)} 

for x in G(w). 

This is an obvious consequence of the following facts: b induces in S and 
in G(v)/S the identical automorphism; if G*(v)/G*(w) is of finite order p, 
then b*^p = (Sc)^{γ(w)} = (Sc′)^{γ(w)}. 

That these definitions of G(v) and γ(v) are in accordance with (i) to (iv) 
is clear. Any pair of elements in G(v) has the form xb^i, yb^j with x, y in G(w). 
Hence, by (1), (i), (v) is a consequence of 

(xb^i, yb^j) = (b*^i o (Sy)^{γ(w)}) ((Sx)^{γ(w)} o b*^j) (x, y) 
 = (Sxb^i)^{γ(v)} o (Syb^j)^{γ(v)}. 

† A. M. Turing, The extension of groups, Compositio Mathematica, vol. 5 (1938), pp. 557-567. 


Suppose that G* contains elements of order m and that xb^i is an element 
in G(v) such that (xb^i)^m is an element of S. Then it follows from (i) and (1.3) 
that 

(xb^i)^m = ((Sx)^{γ(w)} o b*)^{−im(m−1)/2} x^m b^{im}; 

consequently x^m b^{im} is an element of S. Therefore i = 0, if G*(v)/G*(w) is 
infinite, and S^m x^m = P(m, (Sx)^{γ(w)}) by (vi) (which condition is satisfied in 
G(w)). If G*(v)/G*(w) is of finite order p, then it may be assumed without 
loss in generality that 0 ≤ i < p. If i = 0, then the above argument may be 
applied; if i ≠ 0, then i and p are relatively prime. Since b^{im} is an element of 
G(w), this implies that m is a multiple of p. Since i is relatively prime to p, 
and since b^i and xb^i are elements of the same class (mod G(w)), it follows 
that the order of b*^i is finite and a divisor of the order of (Sxb^i)^{γ(v)}. Hence 
pn is a divisor of m, and b^m, as well as x^m, is an element of S. Consequently 

S^m (xb^i)^m = ((Sx)^{γ(w)} o b*)^{−im(m−1)/2} P(m, (Sx)^{γ(w)}) P(m, b*^i) 

by (vi) applied on G(w), by (2.12), (d) and by condition (3); and 

S^m (xb^i)^m = P(m, (Sx)^{γ(w)} b*^i) = P(m, (Sxb^i)^{γ(v)}) 

by (2) and (2.13). Thus the condition (vi) is satisfied by G(v) and γ(v). 

Case 2. v is a limit ordinal. Let G(v) be the uniquely determined group 
which contains all the groups G(u) for u < v and is just their join, and let γ(v) 
be the uniquely determined isomorphism of G(v)/S upon G*(v) which coincides 
on every G(u)/S for u < v with γ(u). That the conditions (i) to (vi) are 
satisfied is clear. 

Thus G(v), γ(v) are defined for every 0 ≤ v ≤ t. Since G*(t) = G*, it follows 
that the group G(t) = G and the isomorphism γ(t) = γ meet all the 
requirements of the theorem. This completes the proof. 

COROLLARY 2.2. Suppose that S and G* are abelian groups and that x* o y* 
is, for every x* and y* in G*, an element in S. Then there exists a group G such 
that C(G) ≤ S ≤ Z(G) and an isomorphism γ of G/S upon the whole group G* 
such that 

(Sx)^γ o (Sy)^γ = (x, y) 
for x and y in G if, and only if, 


1 = x* o x* = (x* o y*)(y* o x*), x* o y*z* = (x* o y*)(x* o z*). 

Proof. The necessity of the condition is a consequence of (1.1). Suppose 
now that the condition is satisfied. Denote by F(G*) the subgroup of all the 
elements of finite order in G*. Then there exists a basis B* of F(G*) 
(mod F(G*)^2)† which contains, for every positive integer i, a basis B_i* of 
G*_{2^i} modulo the cross cut of G*^2 and G*_{2^i}.‡ (B_i* is the cross cut of B* and 
G*_{2^i}.) If n is an odd integer and m a non-negative integer, then every 
element x* in G*_{n2^m} may be represented in one and only one way in the form 

x* ≡ Π_{b* ∈ B_m*} b*^{h(x*, b*)} (mod G*^2), 

h(x*, b*) = 0 or 1, and h(x*, b*) = 0 for almost every b*. If b_1*, ..., b_k* are 
exactly those elements b* in B_m* such that h(x*, b*) = 1, then put 


P(n2™, x*) = 0 
i<i 


It is now fairly obvious that the functions x* o y* and P(r, x*) satisfy 
the conditions (1) to (3) of the Theorem 2.1, and the existence of a group G 
and of an isomorphism y, meeting the requirements of the Corollary 2.2, is 
a consequence of Theorem 2.1. 


COROLLARY 2.3. If Z and G are abelian groups, then the necessary and 
sufficient condition for the existence of a group whose central is Z and whose 
central quotient group is G is the existence of an operation x o y of the elements 
x and y in G with values in Z, satisfying the conditions: 

(1) x o y is for every x and y in G an element in Z. 

(2) 1 = x o x = (x o y)(y o x). 

(3) x o yz = (x o y)(x o z). 

(4) w o x = 1 for every x in G if, and only if, w = 1. 

This corollary is a consequence of Corollary 2.2 and of (1.2). 
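
As a small finite illustration of the conditions of Corollary 2.3, the following Python sketch checks conditions (2) to (4) by brute force for one particular operation. Everything concrete in it is an assumption made for the example and is not taken from the paper: G is the additive group Z_p x Z_p, Z is the additive group Z_p, and x o y is the determinant x1*y2 - x2*y1 (mod p). Because the code writes both groups additively, the identity 1 of the text corresponds to 0. Condition (1) holds by construction.

```python
from itertools import product


def check_corollary_2_3(p):
    """Brute-force check of conditions (2)-(4) of Corollary 2.3.

    Assumed example (not from the paper): G = Z_p x Z_p and Z = Z_p,
    both written additively, with x o y = x1*y2 - x2*y1 (mod p).
    """
    G = list(product(range(p), repeat=2))
    zero = (0, 0)

    def op(x, y):
        return (x[0] * y[1] - x[1] * y[0]) % p

    def add(x, y):
        return ((x[0] + y[0]) % p, (x[1] + y[1]) % p)

    # (2): x o x is trivial and (x o y)(y o x) is trivial.
    cond2 = all(op(x, x) == 0 and (op(x, y) + op(y, x)) % p == 0
                for x in G for y in G)
    # (3): x o (yz) = (x o y)(x o z).
    cond3 = all(op(x, add(y, z)) == (op(x, y) + op(x, z)) % p
                for x in G for y in G for z in G)
    # (4): w o x is trivial for every x exactly when w is trivial.
    cond4 = all(all(op(w, x) == 0 for x in G) == (w == zero) for w in G)
    return cond2, cond3, cond4


print(check_corollary_2_3(3))
```

For p = 3 all three checks succeed. This agrees with the existence of a non-abelian group of order 27 (the upper unitriangular 3 x 3 matrices modulo 3) whose central is cyclic of order 3 and whose central quotient group is Z_3 x Z_3.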

3. Factor sets. Suppose that C(G) ≤ S ≤ Z(G) and that G* = G/S is a 
direct product of cyclic groups. Let B* be a basis of G*, and let B be some set 
of representatives in G of the classes in B*. The elements in B may be ordered 
in some way. It is important to note that the following formulas depend on 
the way in which B has been ordered. 

If x* is any element of finite order in G*, then let n(x*) be its order. 
Now every element x in G may be represented in one and only one way as 

x = s(x) Π_{b∈B} b^{n(x, b)} 

where the factors of the product are ordered in the same way as B, almost 
every n(x, b) = 0, where 0 ≤ n(x, b) < n(Sb) (if Sb is an element of finite order), 
and s(x) is an element in S. 

† K. Taketa, Japanese Journal of Mathematics, vol. 13 (1937), pp. 129-232. 
‡ Cf., for example, R. Baer, American Journal of Mathematics, vol. 59 (1937), pp. 99-117, 
particularly §1. 

If x and y are two elements in G, then integers h(x, y; b), which may only 
be 0 or 1, are uniquely determined by the equation 

n(xy, b) + h(x, y; b)n(Sb) = n(x, b) + n(y, b) 

if Sb is of finite order, and 

n(xy, b) = n(x, b) + n(y, b) 

if Sb is of infinite order. Put furthermore 

a(x, y) = Π′ b^{n(Sb)}, 

where Π′ indicates a product taken over all those b in B such that Sb is of 
finite order and h(x, y; b) = 1; and put 

c(x, y) = Π_{b < b′} (Sb′ o Sb)^{n(x, b′) n(y, b)}, 

where Sb o Sb′ = (b, b′). Then it follows from (1.1) that 

s(xy) = s(x)s(y)a(x, y)c(x, y). 

Note that a(x, y) = a(Sx, Sy) = a(Sy, Sx) and c(x, y) = c(Sx, Sy). The product 
a(x*, y*)c(x*, y*) is, in the usual terminology,† a factor set of the extension G 
of S by G* satisfying S ≤ Z(G). 
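
The integers h(x, y; b) of this section are simply carries in the exponent arithmetic. A minimal Python sketch, assuming that the exponents n(x, b) and n(y, b) are already known and that the order n(Sb) is passed as `order`, with `None` standing for infinite order; the function name `exponent_and_carry` is hypothetical:

```python
def exponent_and_carry(nx, ny, order=None):
    """Return (n(xy, b), h(x, y; b)) from n(x, b), n(y, b) and n(Sb).

    `order` is n(Sb); None is used here (an assumption of this sketch)
    to signal that Sb has infinite order, in which case no carry occurs.
    """
    total = nx + ny
    if order is None:
        return total, 0
    if not (0 <= nx < order and 0 <= ny < order):
        raise ValueError("exponents must satisfy 0 <= n < n(Sb)")
    # n(xy, b) + h * n(Sb) = n(x, b) + n(y, b), with h equal to 0 or 1.
    return total % order, total // order


print(exponent_and_carry(3, 4, 5))     # (2, 1)
print(exponent_and_carry(1, 2, 5))     # (3, 0)
print(exponent_and_carry(3, 4, None))  # (7, 0)
```

Each basis element b with carry 1 contributes one factor b^{n(Sb)}, an element of S, to a(x, y).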

4. Existence of transformations. Suppose that S, G and T, H are groups 
with the property that C(G) ≤ S ≤ Z(G) and C(H) ≤ T ≤ Z(H). Then φ is called 
an S-T-transformation of G into H, if φ obeys the following postulates: 

(1) x^φ is, for every x in G, a uniquely determined element in H. 

(2) (sx)^φ = s^φ x^φ for s in S and x in G. 

(3) φ maps S exactly upon the whole group T. 

(4) φ induces in G* = G/S a homomorphism upon the whole group 
H* = H/T. 

(5) If n is a positive integer, and if x and y are elements in G and x^n, y^n 
elements in S, then ((xy)^n)^φ = ((xy)^φ)^n = (x^φ y^φ)^n. 

It follows from (2) and (3) that the S-T-transformation φ induces in S 
a homomorphism upon T, and it follows from (1) to (4) that every element 
in H is, under φ, the image of some element in G. Using the conditions (1) 
to (4) we may analyze the condition (5) in the following way. By choosing 
y = 1, (5) specializes to the statement: 

(5′) If n is a positive integer, x an element in G, and x^n an element in S, 
then (x^n)^φ = (x^φ)^n. 

† Cf., for example, R. Baer, Mathematische Zeitschrift, vol. 38 (1934), pp. 375-416. 

Since φ induces a homomorphism of S upon the whole group T, it follows 
that (S^n)^φ = T^n, consequently (5′) implies the condition: 

CONDITION 4.a. P(S < G; n, x*)^φ = P(T < H; n, x*^φ) for x* in G_n*. 

If (1) to (4) and (5′) hold, and if Sx and Sy are elements in G_n*, then 

((xy)^φ)^n = ((xy)^n)^φ = ((y, x)^{n(n−1)/2} x^n y^n)^φ 
 = ((y, x)^{n(n−1)/2})^φ (x^φ)^n (y^φ)^n 
 = (y^φ, x^φ)^{−n(n−1)/2} ((y, x)^{n(n−1)/2})^φ (x^φ y^φ)^n. 

This follows from (1.3) and the facts that φ is a homomorphism of S upon T 
and of G* upon H*, and that x^n, y^n are in S. In view of (1.1) it is fairly 
obvious that under the assumption of (1) to (4), condition (5) holds if, and only 
if, condition (5′) and the following condition are satisfied: 

CONDITION 4.b. ((S < G; x*, y*)^{2^{m−1}})^φ = (T < H; x*^φ, y*^φ)^{2^{m−1}} for x* and y* 
in G_{2^m}*. 
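
The analysis of condition (5) rests on the power identity for groups whose commutators lie in the central, presumably the content of (1.3), which is stated outside this section: (xy)^n = x^n y^n (y, x)^{n(n−1)/2}. The Python sketch below checks this identity by brute force. Both the commutator convention (y, x) = y x y^{-1} x^{-1} and the example group are assumptions of the sketch and are not taken from the paper: the group consists of triples (a, b, c) modulo m with the product (a, b, c)(a′, b′, c′) = (a+a′, b+b′, c+c′+ab′), that is, upper unitriangular 3 x 3 matrices modulo m.

```python
from itertools import product


def make_group(m):
    """Upper unitriangular 3x3 matrices mod m, stored as triples (a, b, c)."""
    def mul(x, y):
        return ((x[0] + y[0]) % m, (x[1] + y[1]) % m,
                (x[2] + y[2] + x[0] * y[1]) % m)

    def inv(x):
        return ((-x[0]) % m, (-x[1]) % m, (x[0] * x[1] - x[2]) % m)

    def power(x, n):
        result = (0, 0, 0)
        for _ in range(n):
            result = mul(result, x)
        return result

    elements = list(product(range(m), repeat=3))
    return elements, mul, inv, power


def power_identity_holds(m, max_n):
    """Check (xy)^n == x^n y^n (y, x)^(n(n-1)/2) for all x, y, 1 <= n <= max_n."""
    elements, mul, inv, power = make_group(m)
    for x in elements:
        for y in elements:
            comm = mul(mul(y, x), mul(inv(y), inv(x)))  # (y, x)
            for n in range(1, max_n + 1):
                lhs = power(mul(x, y), n)
                rhs = mul(mul(power(x, n), power(y, n)),
                          power(comm, n * (n - 1) // 2))
                if lhs != rhs:
                    return False
    return True


print(power_identity_holds(4, 6))
```

The exponent n(n−1)/2 is the reason why only powers of 2 need special treatment: for n = 2^m it equals 2^{m−1}(2^m − 1), which is where the exponent 2^{m−1} of Condition 4.b comes from.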

THEOREM 4.1. Suppose that S, G and T, H are groups with the property 
C(G) ≤ S ≤ Z(G) and C(H) ≤ T ≤ Z(H), that G* = G/S is a direct product of 
cyclic groups, and that conditions 4.a and 4.b are satisfied by the homomorphism 
σ of S upon T and the homomorphism λ of G* upon H* = H/T. Then there exists 
an S-T-transformation ψ of G into H which induces σ in S, and λ in G* and 
satisfies the condition: 

(6) If T′ is the subgroup of T, generated by the elements 

(T < H; x*^λ, y*^λ)^{-1} (S < G; x*, y*)^σ 

for x* and y* in G*, then ψ induces a homomorphism of G upon H/T′. 

Remark. An obvious consequence of the condition (6) is the following 
condition: 

(6′) If C(G) ≤ U ≤ S and C(H) ≤ U^σ, then ψ induces a homomorphism of 
G/U upon H/U^σ. 

Proof. Let B be an ordered set in G representing a basis of G/S. If b is 
any element in B, and if the order of Sb is finite, then it follows from 
Condition 4.a that there exists an element f(b) in (Sb)^λ such that 
(b^{n(Sb)})^σ = f(b)^{n(Sb)}, where n(Sb) is the order of Sb. 

It has been mentioned in §3 that every element x in G may be represented 
in one and only one way in the form 

x = s(x) Π_{b∈B} b^{n(x, b)} 

where s(x) is an element in S, almost every n(x, b) = 0, and 0 ≤ n(x, b) < n(Sb), 
if Sb is an element of finite order n(Sb). A function ψ may be defined for every 
x in G by the equation 

x^ψ = s(x)^σ Π_{b∈B} f(b)^{n(x, b)} 

where the factors in the product are ordered in the same way as the elements 
b in B. 

The element x^ψ is, for every x in G, uniquely determined in H. If s is an 
element in S, and x in G, then s(sx) = s s(x) and n(sx, b) = n(x, b); consequently 
(sx)^ψ = s^σ x^ψ = s^ψ x^ψ. This implies also that ψ induces σ in S. Since B is a basis 
of G (mod S), and since (Sb)^ψ = T f(b) = (Sb)^λ, it follows that ψ induces λ in G*. 

Suppose now that x is an element in G and n a positive integer such that x^n 
is an element in S. Then n(x, b) = 0 if Sb is of infinite order, and n(x, b)n 
= n′(x, b)n(Sb) if Sb is of finite order. It follows from (1.3) that 

x^n = s(x)^n a(x^n) c(x^n), 

where 

a(x^n) = Π′_{b∈B} (b^{n(Sb)})^{n′(x, b)} 

(Π′ indicates a product over those b such that Sb is of finite order), and 

c(x^n) = Π_{b < b′} (S < G; Sb′, Sb)^{n(x, b′) n(x, b) n(n−1)/2}. 

Computing a((x^ψ)^n) and c((x^ψ)^n), accordingly, one finds that 

a(x^n)^σ = Π′_{b∈B} ((b^{n(Sb)})^σ)^{n′(x, b)} = Π′_{b∈B} (f(b)^{n(Sb)})^{n′(x, b)} = a((x^ψ)^n), 

and it follows from Condition 4.b that 

c(x^n)^σ = Π_{b < b′} (T < H; (Sb′)^λ, (Sb)^λ)^{n(x, b′) n(x, b) n(n−1)/2} = c((x^ψ)^n). 

Thus finally 

(x^n)^ψ = (s(x)^σ)^n a(x^n)^σ c(x^n)^σ = (s(x)^σ)^n a((x^ψ)^n) c((x^ψ)^n) = (x^ψ)^n. 

Hence ψ satisfies the conditions (1) to (4) and (5′), and, since ψ satisfies 
Condition 4.b, this implies that ψ is an S-T-transformation of G into H. 

If x and y are two elements in G, if a(x, y) and c(x, y) are defined as in §3, 
and if a(x^ψ, y^ψ) and c(x^ψ, y^ψ) are computed accordingly, then 

a(x, y)^σ = Π′ (b^{n(Sb)})^σ = Π′ f(b)^{n(Sb)} = a(x^ψ, y^ψ) 

and 

(xy)^ψ = (s(x) s(y) a(x, y) c(x, y))^σ Π_{b∈B} f(b)^{n(xy, b)} 
 = x^ψ y^ψ c(x, y)^σ c(x^ψ, y^ψ)^{-1}. 

This shows that ψ satisfies (6) also. 

LEMMA 4.2. If C(G) ≤ S ≤ Z(G), if C(H) ≤ T ≤ Z(H), and if φ is an S-T-
transformation of G into H which induces in S the homomorphism σ and in 
G/S the homomorphism λ, then a necessary and sufficient condition for φ to be a 
one-one correspondence is that σ and λ are isomorphisms. 

Proof. The necessity of the conditions is clear. If the conditions are 
satisfied, and if x and y are two elements in G such that x^φ = y^φ, then 
(Sx)^λ = (Sy)^λ, and consequently Sx = Sy; that is, x = sy for some s in S. Hence 
y^φ = x^φ = (sy)^φ = s^σ y^φ and x = y, since s^σ = 1 implies s = 1. 

5. Homomorphisms and isomorphisms. We prove the following theorem: 

THEOREM 5.1. Suppose that S, G and T, H are groups with the property 
C(G) ≤ S ≤ Z(G) and C(H) ≤ T ≤ Z(H), that G/S is a direct product of cyclic 
groups, that σ is a homomorphism of S upon T and λ a homomorphism of 
G* = G/S upon H* = H/T. Then there exists a homomorphism of G into H which 
induces σ in S and λ in G* if, and only if, 

(a) (S < G; x*, y*)^σ = (T < H; x*^λ, y*^λ) for x*, y* in G*, 

(b) P(S < G; n, x*)^σ = P(T < H; n, x*^λ) for x* in G_n*. 

A homomorphism of G into H, inducing σ in S and λ in G*, is an 
isomorphism if, and only if, σ and λ are isomorphisms. 

Proof. The necessity of these conditions is obvious. If, on the other hand, 
σ and λ satisfy the conditions (a) and (b), then it follows from Theorem 4.1 
that there exists an S-T-transformation ψ of G into H which induces σ in 
S and λ in G* and satisfies the condition (6) of Theorem 4.1. But by condition 
(a) it follows that the subgroup T′ of T, mentioned in (6), consists of the 
identity only, and ψ is consequently a homomorphism. The last statement 
of Theorem 5.1 is an obvious consequence of Lemma 4.2. 

6. Uniqueness theorems. We prove the following theorems: 


THEOREM 6.1. Suppose that S, G and T, H are groups with the property 
C(G) ≤ S ≤ Z(G), C(H) ≤ T ≤ Z(H), and that G/S is a direct product of cyclic 
groups. Then there exists an isomorphism of G upon H which maps S upon T if, 
and only if, there exists an isomorphism σ of S upon T and an isomorphism λ 
of G* = G/S upon H* = H/T satisfying conditions (a) and (b) of Theorem 5.1. 

This is an obvious consequence† of Theorem 5.1. 


THEOREM 6.2. Suppose that G/Z(G) is a direct product of cyclic groups. 
Then G and the group H are isomorphic if, and only if, there exists an 
isomorphism σ of Z(G) upon Z(H) and an isomorphism λ of G/Z(G) upon H/Z(H) 
which satisfy the conditions (a) and (b) of Theorem 5.1. 

This is a consequence of Theorem 6.1 and the fact that an isomorphism of 
G upon H maps Z(G) upon Z(H). 

Remark 6.3. If one assumes that G and H are groups with abelian central 
quotient groups, then Theorem 6.2 is transformed into another true theorem, if 
everywhere the commutator group is substituted for the central. 


Remark 6.4. If G/Z(G) is abelian and countable and does not contain ele- 
ments of infinite order, then G/Z(G) is a direct product of cyclic groups. 

Since G* = G/Z(G) is abelian and does not contain elements of infinite 
order, it is the direct product of its primary components. If w* is an element 
in G* whose order is a power p^i of the prime number p, if the equation 
y*^{p^k} = w* has a solution for every positive integer k, and if x* is an element of 
order p^j in G*, then 

(Z(G) < G; w*, x*) = w* o x* = v*^{p^j} o x* for some v* ∈ G* 
 = v* o x*^{p^j} = v* o 1 = 1 

by (1.1). Now it follows from (1.2) that w* = 1. Since G* is countable, it 
follows from a theorem of H. Prüfer‡ that G* is a direct product of cyclic groups. 

In Appendix A a uniqueness theorem will be proved in which G/Z(G) does 
not contain elements of infinite order and need not be a direct product of 
cyclic groups. 

Example 6.5. This is to show that Theorem 6.2 does not hold if G/Z(G) is 
a countable abelian group but not a direct product of cyclic groups. 


Denote by A a direct product of a countably infinite number of infinite 
cyclic groups, by a(1), a(2), ..., a basis of A and by B an abelian group 
which is generated by elements b(1), b(2), ..., subject to the relations 
b(i+1)^n = b(i), for i = 1, 2, ..., and a fixed integer n > 1. Z is the direct 
product of A and B. 

The abelian group G* is generated by elements g(i, j)*, for j = 1, 2 and 
i = 1, 2, ..., satisfying the relations g(i+1, j)*^n = g(i, j)*, for i = 1, 2, .... 
G* does not contain elements of finite order not equal to one, and a normal 
form of the elements in G* is 

g(i, 1)*^h g(m, 2)*^k, 

with 0 < h < n, if 1 < i, and 0 < k < n, if 1 < m. 

† That these conditions are sufficient is an obvious consequence of well known theorems in the 
theory of extensions of groups; cf., for example, R. Baer, Mathematische Zeitschrift, vol. 38 (1934), 
p. 410, formula (3), or p. 391, Theorem 2. 

‡ H. Prüfer, Mathematische Zeitschrift, vol. 17 (1923), pp. 48-57. 
Essentially the only operation x* o y* of G* in Z which satisfies the 
conditions (1) to (4) of Corollary 2.3 is defined by 

g(i, 1)*^h g(m, 2)*^k o g(i′, 1)*^{h′} g(m′, 2)*^{k′} = b(i + m′ − 1)^{hk′} b(i′ + m − 1)^{−h′k}. 

The group G is generated by adjoining to Z elements g(i, j) which are 
subject to the relations 

Z ≤ Z(G), g(i+1, j)^n = g(i, j), (g(i, 1), g(k, 2)) = b(i + k − 1). 

The group W is generated by adjoining to Z elements w(i, j) which are 
subject to the relations 

Z ≤ Z(W), w(i+1, j)^n = w(i, j)a(i), (w(i, 1), w(k, 2)) = b(i + k − 1). 

It follows from (1.2) that both groups G and W have Z as central and G* 
as central quotient group, and realize the operation x* o y*. Since G* does not 
contain elements other than one of finite order, the conditions of Theorem 6.2 
are satisfied with the exception that G* is not a direct product of cyclic 
groups. But G and W are not isomorphic. For every class, not equal to Z, 
of G/Z contains elements which are n^r-th powers of elements in G for every 
positive integer r. But if x is an element in the class Zw(1, 1) of W/Z, then 
the integers r, such that y^{n^r} = x has a solution y in W, are bounded, as an 
n^r-th power of an element in W which is contained in Zw(1, 1) has the form 

z^{n^r} a(r)^{n^{r−1}} a(r − 1)^{n^{r−2}} ··· a(1) w(1, 1), z in Z. 

It is a consequence of Theorem 2.1 and Theorem 6.2 that the problem of 
constructing all groups whose centrals are a preassigned abelian group Z and 
whose central quotient groups are isomorphic to a given direct product G* 
of cyclic groups is equivalent to the problem of defining all sets of functions 

x* o y*, P(n, x*) 

which satisfy the conditions (1) to (4) of Corollary 2.3 and the conditions (2), 
(3) of Theorem 2.1. If 

x* o y*, P(n, x*) and x* o′ y*, P′(n, x*) 

are two sets of functions of G* in Z, subject to the mentioned conditions, then 
they characterize isomorphic groups if, and only if, there exists an 
automorphism φ of Z and an automorphism ψ of G* such that 

(a) x*^ψ o y*^ψ = (x* o′ y*)^φ for x*, y* in G*, and 

(b) P(n, x*^ψ) = P′(n, x*)^φ for x* in G_n*. 

That it is not sufficient to assume the existence of one pair of 
automorphisms ψ, φ satisfying (a) and of another pair of automorphisms ψ′, φ′, 
satisfying (b), may be seen from the following example: 

Example 6.6. Suppose that Z is a direct product of a cyclic group of order 
p®, generated by 2, and a cyclic group of order p, generated by u, where p 
is a prime number not 2, and that G* is a direct product of two cyclic groups 
of order p? (b*, c* a basis of G*). Admissible sets of functions of G* in Z may 
be characterized by the equations 


b*¥oc* = vu, P(p?, b*) = P(p’, c*) = 


and another set by the equations 


b* = vu, P(p?, b*)’ = P(p?, c*)’ = 


Clearly there exists a pair of automorphisms ψ, φ which satisfies (a) and 
another pair which satisfies (b) but none which satisfies both (a) and (b). 

7. Types of subgroups. Suppose that G is a group with abelian central 
quotient group, and that the subgroups S and T of G are situated between 
the central and the commutator group of G. Assume furthermore that G/S 
is a direct product of cyclic groups. Then it follows from Theorem 5.1 that 
there exists an automorphism of G which maps S upon T (that is, that S and 
T are isotype in G) if, and only if, there exists an isomorphism φ of S upon T 
and an isomorphism λ of G/S upon G/T such that 

(a) (S < G; x*, y*)^φ = (T < G; x*^λ, y*^λ) for x*, y* in G/S and 

(b) P(S < G; n, x*)^φ = P(T < G; n, x*^λ) for x* in (G/S)_n. 

If, in particular, G is an abelian group, then condition (a) may be omitted, 
and this characterization of the types of subgroups applies to any subgroup 
whose quotient group is a direct product of cyclic groups. It applies therefore 
to all subgroups, if, for example, G is a group with a finite number of genera- 
tors or if the orders of the elements in G are bounded. 

8. Conformality. Two groups G and H are said to be conformal, if they 
contain for every positive integer n the same number of elements of order n 
and also the same number of elements of infinite order. Since (as obvious ex- 
amples show) infinite abelian groups without elements of infinite order may 
be conformal without being isomorphic, this definition of conformality is too 
wide for our purposes, and therefore the following definition may be adopted. 


DEFINITION 8.1. The groups G and H with abelian central quotient groups 
are said to be conformal if there exist groups S and T between the central and the 
commutator groups of G and H, respectively, and an S-T-transformation of G 
into H which is a one-one correspondence between G and H and satisfies condi- 
tion (6) of Theorem 4.1. Such an S-T-transformation may be termed an S-T- 
conformality of G upon H. 


Since S-T-conformalities induce isomorphisms in S and G/S and sat- 
isfy condition (5’) of §4, conformal groups are also conformal in the wider 
sense mentioned above. It is a consequence of condition (6) of Theorem 4.1 
that conformalities between abelian groups are isomorphisms. 
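
Conformality in the wider sense can be tested on finite groups by counting the elements of each order. The Python sketch below compares two groups of order 27, and compares one group of order 8 with all abelian groups of order 8. The example groups (upper unitriangular 3 x 3 matrices modulo m, stored as triples, and direct products of cyclic groups) are assumptions chosen for illustration and do not appear in the paper.

```python
from collections import Counter
from itertools import product


def order_counts(elements, mul, identity):
    """Map each element order to the number of elements having that order."""
    counts = Counter()
    for x in elements:
        n, y = 1, x
        while y != identity:
            y = mul(y, x)
            n += 1
        counts[n] += 1
    return dict(counts)


def unitriangular_order_counts(m):
    """Order statistics of upper unitriangular 3x3 matrices mod m."""
    def mul(x, y):
        return ((x[0] + y[0]) % m, (x[1] + y[1]) % m,
                (x[2] + y[2] + x[0] * y[1]) % m)
    return order_counts(product(range(m), repeat=3), mul, (0, 0, 0))


def abelian_order_counts(moduli):
    """Order statistics of the direct product of cyclic groups Z_k, k in moduli."""
    def mul(x, y):
        return tuple((a + b) % k for a, b, k in zip(x, y, moduli))
    identity = tuple(0 for _ in moduli)
    return order_counts(product(*(range(k) for k in moduli)), mul, identity)


print(unitriangular_order_counts(3) == abelian_order_counts((3, 3, 3)))
print(unitriangular_order_counts(2))
print([abelian_order_counts(t) for t in [(8,), (4, 2), (2, 2, 2)]])
```

The non-abelian group of order 27 and the elementary abelian group of order 27 both contain 26 elements of order 3, so they are conformal in the wider sense without being isomorphic. The group of order 8 has five elements of order 2 and two of order 4, a distribution that no abelian group of order 8 shares.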

Conformality is a symmetric and reflexive relation. But in general it is 
not transitive, as follows from Example 8.4. 

Suppose now that C(G) ≤ S ≤ Z(G), C(H) ≤ T ≤ Z(H) and that G/S is a 
direct product of cyclic groups. Then it follows from the results of §4, that 
there exists an S-T-conformality of G upon H if, and only if, there exists an 
isomorphism σ of S upon T and an isomorphism λ of G/S upon H/T such that 

(a) P(S < G; n, x*)^σ = P(T < H; n, x*^λ) for x* in (G/S)_n and 

(b) ((S < G; x*, y*)^{2^{m−1}})^σ = (T < H; x*^λ, y*^λ)^{2^{m−1}} for x*, y* in (G/S)_{2^m}. 

Exactly those pairs σ, λ which satisfy the conditions (a) and (b) may be 
induced by S-T-conformalities of G upon H. 

As condition (b) is not very restrictive, this shows that conformality de- 
pends essentially on the P-functions. 


THEOREM 8.2. If G is a group with abelian central quotient group, and if 
G/C(G) is a direct product of cyclic groups, then the following three properties 
of G are equivalent: 

(a) G is conformal to an abelian group. 

(b) If x and y are elements of G and if n is a positive integer such that x^n and 
y^n are elements of C(G), then x^n y^n = (xy)^n. 

(c) (C(G) < G; x*, y*)^{2^{m−1}} = 1 for x*, y* in (G/C(G))_{2^m}. 

Proof. If (a) holds, then there exists a group S between C(G) and Z(G), 
an abelian group H and a subgroup T of H, and an S-T-conformality φ of G 
upon H. If n is a positive integer, and if x and y are elements in G such that 
x^n and y^n are elements in C(G), then x^n and y^n are elements in S; hence 

((xy)^n)^φ = (x^φ y^φ)^n = (x^φ)^n (y^φ)^n 
 = (x^n)^φ (y^n)^φ = (x^n y^n)^φ, 

since G^φ = H is an abelian group, condition (5′) holds, and φ is a 
homomorphism on S. Now (b) is a consequence of the fact that φ defines a one-one 
correspondence between G and H. 

If (b) holds true, then it follows from (1.3) that for any two elements x 
and y such that C(G)x and C(G)y are elements of (G/C(G))_{2^m} the following 
equalities hold: 

x^{2^m} y^{2^m} = (xy)^{2^m} = (C(G) < G; C(G)x, C(G)y)^{2^{m−1}} x^{2^m} y^{2^m}; 

and (c) is therefore a consequence of (b). 

Suppose now that (c) is satisfied. Then it follows from Theorem 2.1 that 
there exists an abelian group H, which contains C(G) = S as a subgroup, and 
an isomorphism γ of G/C(G) upon H/S such that P(S < H; n, x*^γ) 
= P(C(G) < G; n, x*) for x* in (G/C(G))_n. It follows, from the fact mentioned 
in the beginning of this section, that there exists a C(G)-S-conformality of 
G upon H which maps every element in C(G) = S upon itself and induces γ 
in G/C(G). Hence (a) is a consequence of (c). 

Note that the existence of a basis of G/C(G) has been needed only to 
prove that (a) is a consequence of (c). 
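
Property (b) of Theorem 8.2 can be tested directly on a finite group. The Python sketch below does so for the group of upper unitriangular 3 x 3 matrices modulo m, stored as triples; this example group is an assumption chosen for illustration and is not taken from the paper. The commutator group C(G) is computed by brute force as the closure of the set of all commutators.

```python
from itertools import product


def property_b_holds(m, max_n):
    """Check: x^n, y^n in C(G) implies x^n y^n == (xy)^n, for 1 <= n <= max_n."""
    elements = list(product(range(m), repeat=3))

    def mul(x, y):
        return ((x[0] + y[0]) % m, (x[1] + y[1]) % m,
                (x[2] + y[2] + x[0] * y[1]) % m)

    def inv(x):
        return ((-x[0]) % m, (-x[1]) % m, (x[0] * x[1] - x[2]) % m)

    def power(x, n):
        result = (0, 0, 0)
        for _ in range(n):
            result = mul(result, x)
        return result

    # C(G): close the set of all commutators under multiplication.
    gens = {mul(mul(x, y), mul(inv(x), inv(y)))
            for x in elements for y in elements}
    commutator_group = set(gens)
    frontier = list(gens)
    while frontier:
        g = frontier.pop()
        for h in gens:
            k = mul(g, h)
            if k not in commutator_group:
                commutator_group.add(k)
                frontier.append(k)

    for n in range(1, max_n + 1):
        for x in elements:
            xn = power(x, n)
            if xn not in commutator_group:
                continue
            for y in elements:
                yn = power(y, n)
                if yn in commutator_group and mul(xn, yn) != power(mul(x, y), n):
                    return False
    return True


print(property_b_holds(3, 9))
print(property_b_holds(2, 4))
```

For m = 3 the check succeeds for every n tried, whereas for m = 2 (a non-abelian group of order 8) it already fails for n = 2.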


THEOREM 8.3. If G is a group with abelian central quotient group, if G/C(G) 
is a direct product of cyclic groups, and if H and H′ are abelian groups which 
are both conformal to G, then H and H′ are isomorphic. 


Proof. From the assumptions there exist subgroups S and S′ between 
C(G) and Z(G), an S-S^φ-conformality φ of G upon H, and an 
S′-S′^{φ′}-conformality φ′ of G upon H′. Since H and H′ are abelian groups, and 
since φ and φ′ are isomorphisms in C(G), it follows from condition (6) of 
Theorem 4.1 that φ is a C(G)-C(G)^φ-conformality of G upon H, and φ′ is a 
C(G)-C(G)^{φ′}-conformality of G upon H′. Consequently there exists an 
isomorphism σ of C(G)^φ = T upon C(G)^{φ′} = T′ and an isomorphism λ of H/T 
upon H′/T′ such that 

P(T < H; n, x*)^σ = P(T′ < H′; n, x*^λ) 

for x* in (H/T)_n. Since G/C(G), H/T, and H′/T′ are isomorphic groups and 
therefore direct products of cyclic groups, and since H and H′ are abelian 
groups, it follows from Theorem 5.1 that there exists an isomorphism of H 
upon H′ which induces σ in T and λ in H/T. 

This proof together with the considerations in §7 makes it clear that the 
investigation of the functions P(n, x*) is a problem which is equivalent to 
the problem of characterization of the classes of isotype subgroups of an 
abelian group. 

Example 8.4. This is to show that the Theorem 8.3 is no longer true, when 
the condition of the existence of a basis of G/C(G) is omitted, and this proves 
incidentally that the conformality relation is not transitive. 

Let A be the direct product of infinite cyclic groups with the basis a(1), 
a(2), ..., and let B be the abelian group generated by the elements b(1), 
b(2), ... subject to the relations 

b(i + 1)^n = b(i), 

where 1 < n is a fixed odd integer. Z denotes the direct product of A and B. 

A* denotes the direct product of infinite cyclic groups with the basis 
a*(1), a*(2), ... and B* the abelian group generated by elements b*(i, j), 
for i = 1, 2, ... and j = 1, 2, subject to the relations 

b*(i + 1, j)^n = b*(i, j). 

G* denotes the direct product of A* and B*. 

An operation x* o y* of G* in Z, satisfying the conditions (1) to (4) of 
Corollary 2.3, is characterized by the equations 

a*(2i − 1) o a*(2i) = a(i), 
a*(i) o a*(j) = 1, for i < j, (i, j) ≠ (2k − 1, 2k), 
b*(i, j) o a*(k) = 1, 
b*(i, 1) o b*(k, 2) = b(i + k − 1). 

The group G may be the group, generated by adjoining to Z elements 
u(i), v(i, j), for i = 1, 2, ...; j = 1, 2, subject to the relations 

Z ≤ Z(G), (u(i), u(j)) = a*(i) o a*(j), (u(i), v(k, j)) = 1, 
(v(i, 1), v(k, 2)) = b*(i, 1) o b*(k, 2), 
v(i + 1, j)^n = v(i, j). 

G is an extension of Z by G* which realizes the operation x* o y*. Hence 
Z = C(G) = Z(G). 

It is easily seen that G is conformal to the abelian group H which is the 
direct product of Z and G*. 

Denote by U′ the direct product of infinite cyclic groups generated by the 
elements u′(1), u′(2), ..., by B′ the direct product of the infinite cyclic 
groups, generated by the elements b′(i, j), for i = 1, 2, ..., j = 1, 2, and by A′ 
the subgroup of B′ which is generated by the elements a′(2(i−1)+j) 
= b′(i, j)b′(i+1, j)^{−n} and is the direct product of the cyclic groups generated 
by the elements a′(k). Let H′ be the direct product of U′, B′, and B, and 
denote by Z′ the direct product of A′ and B. 

A Z-Z′-conformality φ of G upon H′ is defined by 

u′(i) = u(i)^φ, b′(i, j) = v(i, j)^φ, 
a′(i) = a(i)^φ, b(i) = b(i)^φ, 

since φ defines an isomorphism of Z upon Z′ and of G* upon H′/Z′. But H 
and H′ are abelian groups which are not isomorphic,† since H is a direct 
product of cyclic groups by three groups of the type of B, whereas H′ is a direct 
product of cyclic groups by one group of the type of B. 

† Cf. R. Baer, Duke Mathematical Journal, vol. 3 (1937), pp. 69-122, Corollary 2.9. 

It is a special case of Theorem 8.2 that the group G is conformal to an 
abelian group, if C(G) ≤ Z(G), and if G/C(G) is a direct product of cyclic 
groups and does not contain elements of even order. Suppose on the other 
hand that C(G) ≤ S ≤ Z(G) and that 2^m with 0 < m is the l.c.m. of the orders 
of the elements in G/S. If G is S-...-conformal to an abelian group, and 
if w* is an element of order 2^m in G/S, then 

1 = (S < G; w*, x*)^{2^{m−1}} = (S < G; w*^{2^{m−1}}, x*) 

for every x* in G/S; and it follows from (1.1) that the elements in w*^{2^{m−1}} are 
elements of Z(G). Since 1 ≠ w*^{2^{m−1}}, this implies that S ≠ Z(G). 

It is fairly obvious how to construct groups of order a power of 2 which 
are conformal to an abelian group. Let, for example, m be an integer, (1 < m), S 
a cyclic group of order 2^{m−1} generated by s, and G* a direct product of two 
cyclic groups of order 2^m, (u*, v* a basis of G*). The group G with C(G) ≤ S 
≤ Z(G) which realizes the functions characterized by 

u* o v* = s, P(2^m, u*) = P(2^m, v*) = 1 

is, by Theorem 8.2, conformal to an abelian group, though not itself abelian. 
If in this construction it had been assumed that S was of order 2^m, then the 
group G would not be conformal to any abelian group, since S = C(G) = Z(G), 
and it may be computed that the numbers of the elements of a given order in 
G are not the same as the corresponding numbers in any abelian group. 

9. Groups with isomorphic commutator groups and isomorphic central 
quotient groups. If φ is a homomorphism of the group G upon the whole 
group H, then C(G)^φ = C(H) and Z(G)^φ ≤ Z(H). If S is a normal subgroup 
of G, then S^φ is a normal subgroup of H, and φ induces an isomorphism of 
G/S upon H/S^φ, if S is the complete origin of S^φ. 

LEMMA 9.1. If G is a group with abelian central quotient group, and if φ is 
a homomorphism of G upon the (whole) group H which induces an isomorphism 
in C(G), then Z(G)^φ = Z(H), Z(G) is the complete origin of Z(H), and φ induces 
an isomorphism in G/Z(G). 

Proof. Let w be any element in G such that w^φ is an element in Z(H). If x 
is any element in G, then (w, x)^φ = (w^φ, x^φ) = 1, consequently (w, x) = 1 for 
every x in G; that is, w is an element in Z(G). This proves the lemma. 

That the converse does not hold, may be seen from the following example. 

Let p be a prime number not 2, and let Z and G* each be a direct product 
of three cyclic groups of order p. If z, z′, and z″ form a basis of Z, and u*, v*, 
and w* a basis of G*, then an operation x* o y* of G* in Z which satisfies the 
conditions (1) to (4) of Corollary 2.3 is characterized by the equations 

u* o v* = z, v* o w* = z′, w* o u* = z″. 

The functions P(p, x*) = 1, for x* in G*, satisfy the conditions of Theorem 2.1 
with the operation x* o y*. It is therefore a consequence of Theorem 2.1, (1.2), 
and Theorem 6.1 that there exists one, and essentially only one, group G such 
that Z = Z(G) (or C(G)) and G/Z = G* which realizes the functions x* o y* and 
P(p, x*). 

Denote by M the subgroup of G generated by z, and put H=G/M. Then 
it follows from (1.2) that Z(H) =C(H) =Z(G)/M, but this homomorphism 
of G upon H is not an isomorphism in C(G). 


THEOREM 9.2. Suppose that G and G’ are groups whose central quotient 
groups are abelian. Then there exists an isomorphism y of C(G) upon C(G’) 
and an isomorphism d of G/Z(G) upon G'/Z(G’) with the property that 


(ZG) <G; x*, y*)¥ = <G’; 


for x*, y* in G/Z(G) if, and only if, there exists a group H with abelian central 
quotient group and homomorphisms > and $' of H upon G and G’, respectively, 
which induce isomorphisms in C(H). 


Proof. Suppose first that there exists a group H with abelian central quotient
group and homomorphisms φ and φ' of H upon G and G', respectively,
which induce isomorphisms in C(H). Then it follows from Lemma 9.1 that
Z(H)^φ = Z(G), Z(H)^{φ'} = Z(G'), and that φ and φ' induce isomorphisms in
H/Z(H). Thus φ^{-1} is an isomorphism in C(G) and in G/Z(G), and γ = φ^{-1}φ'
is an isomorphism of C(G) upon C(G'), λ = φ^{-1}φ' is an isomorphism of
G/Z(G) upon G'/Z(G').

If x and y are two elements in G, then there exist some elements u and v
in H such that u^φ = x, v^φ = y. Consequently


(x, y)^γ = (u, v)^{φ'} = (u^{φ'}, v^{φ'}),
(Z(G)x, Z(G)y)^γ = ((Z(G)x)^λ, (Z(G)y)^λ),
and the condition is a sufficient one.
Suppose now that there exists an isomorphism γ of C(G) upon C(G') and
an isomorphism λ of G/Z(G) upon G'/Z(G') such that
(i) (Z(G) ≤ G; x*, y*)^γ = (Z(G') ≤ G'; x*^λ, y*^λ) for x*, y* in G/Z(G).


Since G/C(G) is an abelian group, there exist a direct product G* of infinite
cyclic groups and a homomorphism η of G* upon G/C(G). There exist by
Theorem 2.1 and by Theorem 6.1 one, and essentially only one, group K,
such that C(K) = C(G) ≤ Z(K), and an isomorphism κ of K/C(K) upon G*
such that


(C(K) ≤ K; x*, y*) = (C(G) ≤ G; x*^{κη}, y*^{κη})


for x*, y* in K/C(K). Since K/C(K) is a direct product of infinite cyclic
groups, Z(K)/C(K) is also a direct product of infinite cyclic groups,† consequently
Z(K) is the direct product of C(K) and a group D which is a direct
product of infinite cyclic groups. Since κη is a homomorphism of K/C(K)
upon G/C(G), since K/C(K) is a direct product of infinite cyclic groups, it
follows from the above equation and Theorem 5.1 that there exists a homomorphism
ρ of K upon G which leaves the elements in C(K) = C(G) invariant
and induces κη in K/C(K).

Similarly there exists a group K', such that C(G') = C(K') ≤ Z(K'),
K'/C(K') is a direct product of infinite cyclic groups, and Z(K') is the direct
product of C(K') and a group D' which is a direct product of infinite cyclic
groups; and there exists a homomorphism ρ' of K' upon G' which leaves all
the elements in C(K') = C(G') invariant.

Since G/Z(G) is an abelian group, there exists a direct product H* of
infinite cyclic groups and a homomorphism ι of H* upon G/Z(G). Denote by S
the direct product of D, D', and C(G), and denote by H the essentially
uniquely determined group such that C(H) ≤ S ≤ Z(H) and such that there
exists an isomorphism τ of H/S upon H* which satisfies the equation


(S ≤ H; x*, y*) = (Z(G) ≤ G; x*^{τι}, y*^{τι})


for x*, y* in H/S.

Since H/S is a direct product of infinite cyclic groups, H/C(H) is the di- 
rect product of S/C(H) and a direct product D* of infinite cyclic groups. 

By Lemma 9.1, ρ induces an isomorphism of K/Z(K) upon G/Z(G); consequently
the isomorphism ρ^{-1} is well defined on G/Z(G). Since D* is a direct
product of infinite cyclic groups and since D* represents exactly H/S, there
exists a homomorphism σ of H/C(H) upon K/C(K) with the following properties:
x*^σ = 1, if x* = C(H)d' and d' is an element of D'; x*^σ = C(K)d, if
x* = C(H)d and d is an element of D; and Z(K)x*^σ = (Sx*)^{τιρ^{-1}}, if x* is an
element of D*.

Since ρ induces κη in K/C(K), it follows that


(S ≤ H; x*, y*) = (Z(K) ≤ K; x*^{τιρ^{-1}}, y*^{τιρ^{-1}})


for x*, y* in H/S. The homomorphism σ therefore satisfies


(C(K) ≤ K; x*^σ, y*^σ) = (C(H) ≤ H; x*, y*)


for x*, y* in H/C(H).


† R. Baer, Duke Mathematical Journal, vol. 3 (1937), pp. 69-122, especially the remark to
Corollary 8.9.

By Lemma 9.1, ρ' induces an isomorphism of K'/Z(K') upon G'/Z(G');
consequently the isomorphism ρ'^{-1} is well defined on G'/Z(G'). There exists,
therefore, a homomorphism σ' of H/C(H) upon K'/C(K') with the following
properties: x*^{σ'} = C(K')d', if x* = C(H)d' and d' is an element in D'; x*^{σ'} = 1,
if x* = C(H)d and d is an element in D; and Z(K')x*^{σ'} = (Sx*)^{τιλρ'^{-1}}, if x* is
an element of D*.

Since ρ' is a homomorphism of K' upon G' which leaves all the commutators
invariant, it follows from condition (i) that


(C(K') ≤ K'; x*^{σ'}, y*^{σ'}) = (Z(K') ≤ K'; Z(K')x*^{σ'}, Z(K')y*^{σ'})
= (Z(K') ≤ K'; (Sx*)^{τιλρ'^{-1}}, (Sy*)^{τιλρ'^{-1}})
= (Z(G') ≤ G'; (Sx*)^{τιλ}, (Sy*)^{τιλ})
= (Z(G) ≤ G; (Sx*)^{τι}, (Sy*)^{τι})^γ
= (S ≤ H; Sx*, Sy*)^γ = (C(H) ≤ H; x*, y*)^γ


for every x*, y* in H/C(H).

Since H/C(H) is a direct product of infinite cyclic groups, it follows from
Theorem 5.1 that there exists a homomorphism β of H upon K which induces
σ in H/C(H) and leaves the elements in C(H) = C(K) invariant, and that
there exists a homomorphism β' of H upon K' which induces σ' in H/C(H)
and γ in C(H) = C(G). βρ and β'ρ' are homomorphisms of H upon G and G',
respectively. βρ leaves the elements in C(H) = C(K) = C(G) invariant, and β'ρ'
induces in C(H) the isomorphism γ. This completes the proof.

In fact slightly more has been proved, namely, that the group H, which 
meets the requirements of the Theorem 9.2, may always be chosen in such 
a way that H/C(H) is a direct product of infinite cyclic groups. 

The proof of this theorem shows the advantage of treating finite and in- 
finite groups at the same time. For even if the groups G and G’ of the theorem 
are finite, it may prove difficult to construct a group H which meets all the 
requirements without recourse to infinite groups. 

10. Automorphisms. If G is a group with abelian central quotient group
and S a subgroup of G such that C(G) ≤ S ≤ Z(G), then denote by Θ(S ≤ G)
the group of all those (proper) automorphisms of G which map S upon itself.
If S is, for example, the central or the commutator group of G, then Θ(S ≤ G)
is the group of all automorphisms of G.

Those automorphisms of G which leave invariant every element in S and
every element in G/S form a normal subgroup Ω(S ≤ G) of Θ(S ≤ G). If φ is
any automorphism in Ω(S ≤ G), x an element in G and s an element in S, then


x^φ x^{-1} = (sx)^φ (sx)^{-1} = (Sx)^φ (Sx)^{-1} = (φ, Sx)
is an element in S; and if x and y are two elements in G, then


(φ, Sxy) = (xy)^φ (xy)^{-1} = x^φ y^φ y^{-1} x^{-1}
= x^φ (φ, Sy) x^{-1} = (φ, Sx)(φ, Sy),


and conversely every homomorphism of G/S into S may be realized this way.
Since finally


(φη, Sx) = x^{φη} x^{-1} = (x^φ x^{-1})^η x^η x^{-1}
= (φ, Sx)^η (η, Sx) = (φ, Sx)(η, Sx),


as η belongs to Ω(S ≤ G) and (φ, Sx) to S, it follows that Ω(S ≤ G) is essentially
equal to the group of all homomorphisms of G/S into S.
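
The correspondence between automorphisms that fix S and G/S elementwise and homomorphisms of G/S into S can be checked by machine on a small case. The following is only a sketch under assumptions that are not in the text: G is taken to be the Heisenberg group of triples (a, b, c) over the integers mod 3, S is its central {(0, 0, c)}, and the names mul, omega_automorphism, is_automorphism and fixes_S_and_quotient are hypothetical helpers.

```python
from itertools import product

P = 3  # assumed odd prime; this concrete group is an illustration only

def mul(x, y):
    """Product of two elements (a, b, c) of the Heisenberg group over Z/P."""
    (a, b, c), (a2, b2, c2) = x, y
    return ((a + a2) % P, (b + b2) % P, (c + c2 + a * b2) % P)

G = list(product(range(P), repeat=3))

def omega_automorphism(u, v):
    """Send x = (a, b, c) to h(Sx) x, where h(a, b) = (0, 0, u*a + v*b) lies in S."""
    return {x: (x[0], x[1], (x[2] + u * x[0] + v * x[1]) % P) for x in G}

def is_automorphism(phi):
    """Check that phi is a one-one map of G upon G preserving the product."""
    bijective = len(set(phi.values())) == len(G)
    multiplicative = all(phi[mul(x, y)] == mul(phi[x], phi[y]) for x in G for y in G)
    return bijective and multiplicative

def fixes_S_and_quotient(phi):
    """Check that phi leaves every element of S and every class of G/S invariant."""
    fixes_S = all(phi[(0, 0, c)] == (0, 0, c) for c in range(P))
    fixes_classes = all(phi[x][:2] == x[:2] for x in G)
    return fixes_S and fixes_classes

# One automorphism for each of the P*P homomorphisms of G/S into S.
omega = [omega_automorphism(u, v) for u in range(P) for v in range(P)]
```

In this sketch composing two such automorphisms adds the parameters (u, v) mod 3, which mirrors the product rule (φη, Sx) = (φ, Sx)(η, Sx).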

Two automorphisms in Θ(S ≤ G) belong to the same class of Λ(S ≤ G)
= Θ(S ≤ G)/Ω(S ≤ G) if, and only if, they induce the same automorphisms in
S and in G/S. Thus Λ(S ≤ G) is essentially equal to a subgroup of the group
Π(S, G/S) of all the pairs (σ, λ) of automorphisms σ of S and λ of G/S. If
the pair (σ, λ) in Π(S, G/S) is realized by an automorphism in Θ(S ≤ G), then
it follows that

(a) (S ≤ G; x*, y*)^σ = (S ≤ G; x*^λ, y*^λ) for x*, y* in G/S, and

(b) P(S ≤ G; n, x*)^σ = P(S ≤ G; n, x*^λ) for x* in (G/S)_n;
and these conditions are sufficient for the realizability, provided G/S is a direct
product of cyclic groups, as follows from Theorem 5.1.

Suppose now that φ is an automorphism in Ω(S ≤ G) and that the automorphism
η of G induces in S the automorphism σ and in G/S the automorphism
λ. Then


(η^{-1}φη, Sx) = x^{η^{-1}φη} x^{-1} = (x^{η^{-1}φ} (x^{η^{-1}})^{-1})^η
= (φ, Sx^{η^{-1}})^η = (φ, (Sx)^{λ^{-1}})^σ;
consequently it follows that the class of Λ(S ≤ G) which is characterized by
the pair (σ, λ) in Π(S, G/S) induces, in the abelian group Ω(S ≤ G), the automorphism
defined by


(φ, x*) → (φ, x*^{λ^{-1}})^σ


for x* in G/S.

In order to characterize the extension Θ(S ≤ G) of Ω(S ≤ G) by Λ(S ≤ G)
which realizes the above mentioned automorphisms in Ω(S ≤ G), it is necessary
to compute the so-called factor sets. A method for their calculation may
be indicated for groups G, S which satisfy the conditions:

(i) C(G) ≤ S ≤ Z(G).

(ii) G/S = G* is a direct product of cyclic groups.

(iii) If x and y are elements of G and n is a positive integer such that x^n
and y^n are elements of S, then x^n y^n = (xy)^n.

First we define in S for every positive integer n a function s^{1/n} as follows:
1 = 1^{1/n}; if the equation x^n = s has a solution in S, then s^{1/n} is one of these
solutions x; if s is not contained in S^n, then s^{1/n} is not defined.

Second an ordered set B is chosen in G which represents exactly a basis B* 
of G/S, and the functions 


s(x), m(x,b), a(x, y) = a(Sx, Sy), c(x, y) = c(Sx, Sy), 


for x, y in G and b in B, introduced in §3, are defined with reference to the
ordered set B. 
In defining another multiplication of the elements of G by the equation 


= c(x, xy, 


for x, y in G, G is transformed into an abelian group H, and S is a subgroup
of H, since c(x, y) = 1 for x, y in S. Thus the identical mapping of the elements
in G is an S-S-conformality of G upon H. Furthermore G/S = H/S and
P(S ≤ G; n, x*) = P(S ≤ H; n, x*) for x* in G*_n. Finally Ω(S ≤ G) = Ω(S ≤ H)
and Λ(S ≤ G) ≤ Λ(S ≤ H), as follows from the conditions (a) and (b) mentioned
above.

Suppose now that (σ, λ) is a pair in Π(S, G/S) which may be realized by
some automorphism in Θ(S ≤ G). Then


b*^λ = ∏_{d* ∈ B*} d*^{n(b*, d*, λ)}


where almost every n(b*, d*, λ) = 0 and 0 ≤ n(b*, d*, λ) < n(d*), if d* is of finite
order n(d*). If b* is an element of finite order n = n(b*), and if d in B always
represents the element d* in B*, then it follows from the realizability of (σ, λ)
that the elements

= s(b, d) 


deB 


are nth powers of elements in S. Consequently, an automorphism η = η(σ, λ)
in Θ(S ≤ G) is defined by the equations


x’ = x for x in S, 


br = o, 


deB
where ¢(b, , \) = 1, if d* is of infinite order, and ¢(b, , \) =s(b, a, d)"*)", if b* 
is of the finite order n(b*). 

It is easily seen that there exists an automorphism in Θ(S ≤ H) which has
the same values as η on S and on B. But these automorphisms do not constitute
the same mapping of the elements which form both groups G and H.

Thus a representative of every class of Λ(S ≤ G) has been chosen, and the
factor sets are expressions


η(σ, λ)η(σ', λ')η(σσ', λλ')^{-1} = a(σ, λ; σ', λ')c(σ, λ; σ', λ')


where a(···) is the factor set, calculated for the group H, and c(···) is
an expression in the (S ≤ G; b*, d*).

If all the elements of G/Z(G) are of finite odd order, then the investiga- 
tions of Appendix B will produce better results. But the methods of Appendix 
B break down in some of those cases in which the method of this section may 
be applied. 

Appendix A. A uniqueness theorem for p-adic groups. It is the object of 
this appendix to prove a uniqueness theorem which is not contained in Theo- 
rem 6.2. Because of Example 6.5 only groups without elements of infinite 
order will be considered, and by Remark 6.4 it will be necessary to consider 
groups whose central quotient group is infinite but not countable. 

Groups whose central quotient group is abelian are direct products of
p-groups (that is, groups which contain only elements of order a power of
the prime number p) if, and only if, they do not contain elements of infinite
order.† Consequently only p-groups will be considered in this appendix.

The p-group G will be called p-adic if the cross cut of its subgroups G^{p^i},
for i = 1, 2, ···, consists of the group unit element only. The p-group G is
said to be dense in the p-group H, if G ≤ H and every class of H/H^{p^i} contains
elements of G. Finally G is termed a closed p-adic group if G is p-adic and if G
is the only p-adic group in which G is dense.


LEMMA A.1. (a) Every p-adic group is dense in one and essentially only one
closed p-adic group.

(b) If the p-adic group G is dense in the closed p-adic group Ḡ, and if the
p-adic group H is dense in the closed p-adic group H̄, then every isomorphism η
of G upon the whole group H is induced by one and only one isomorphism of Ḡ
upon H̄.


† Cf., for example, R. Baer, Compositio Mathematica, vol. 1 (1934), pp. 254-283, especially the
lemma on p. 261.

Similar statements have been proved before by similar methods,† and it
will therefore be sufficient to give the following indication of a proof of this
lemma. If G is a p-adic group, then consider the set S(G) of all the sequences
S_i with the following properties:

(a) S_i is a class of G/G^{p^i}.

(b) 

(c) The orders of the elements S_i of the groups G/G^{p^i} are bounded. (The
upper bounds for the orders may be different for different sequences.)

If the multiplication in S(G) is defined in the obvious way, S(G) becomes 
a closed p-adic group. In mapping the element x of G upon the sequence G^{p^i}x
an isomorphism of G upon a dense subgroup of S(G) is defined. Now it is 
fairly obvious how to work out a proof of the lemma. 

If G is a p-adic group, then the essentially uniquely determined closed
p-adic group Ḡ, in which G is dense, may be called the p-adic closure of G.

The following example shows the existence of closed p-adic groups whose 
central quotient group is abelian but not a direct product of cyclic groups. 

Denote by Z_i and Z_i* cyclic groups of order p^i, generated by z_i and z_i*,
respectively; and let Z be the direct product of the groups Z_i, for i = 1, 2, ···,
and Z* the direct product of the groups Z_i*, for i = 1, 2, ···. An operation
x* o y* and functions P(p^i, x*) of Z* in Z, which satisfy the conditions (1) to
(4) of Corollary 2.3 and the conditions (1) to (3) of Theorem 2.1, may be
defined by the following equations:


z_i* o z_{i+h}* = z_i (for 1 ≤ h); P(p^i, z_i*) = 1.


Thus there exists one, and essentially only one, group G which satisfies
Z = C(G) = Z(G), Z* = G/Z, and which realizes the functions x* o y*, P(p^i, x*).
G is clearly a p-adic group.

If Ḡ is the p-adic closure of G, then Z(Ḡ) is the p-adic closure of Z(G),
and Ḡ/Z(Ḡ) is the p-adic closure of G/Z(G). It is well known that these
abelian groups are not direct products of cyclic groups.


THEOREM A.2. Suppose that G is a closed p-adic group with abelian central
quotient group. Then G and H are isomorphic groups if, and only if,

(1) H is a closed p-adic group;

(2) there exists an isomorphism φ of Z(G) upon Z(H) and an isomorphism
λ of G/Z(G) upon H/Z(H) which satisfy the conditions:

(a) (Z(G) ≤ G; x*, y*)^φ = (Z(H) ≤ H; x*^λ, y*^λ) for x*, y* in G/Z(G);

(b) P(Z(G) ≤ G; p^i, x*)^φ = P(Z(H) ≤ H; p^i, x*^λ) for x* in (G/Z(G))_{p^i}.


† R. Baer, Journal für die reine und angewandte Mathematik, vol. 160 (1929), pp. 208-226.
H. Freudenthal, Compositio Mathematica, vol. 4 (1937), pp. 145-234.


If condition (1) is satisfied by H, then exactly those pairs of isomorphisms
φ, λ which satisfy (a) and (b) may be induced by isomorphisms of G upon H.

Proof. The necessity of the conditions appearing in either of the statements
of the theorem is fairly obvious. Suppose now that condition (1) is
satisfied by H, and that φ, λ is a pair of isomorphisms satisfying the conditions
(a) and (b). It shall be proved that there exists an isomorphism of G upon H
which induces φ in Z(G) and λ in G/Z(G).

Since G* = G/Z(G) is an abelian p-group, there exists a direct product D*
of cyclic groups which is dense in G*.† Then D*^λ is dense in the abelian
p-group G*^λ = H/Z(H). If D is the group such that Z(G) ≤ D ≤ G, D/Z(G)
= D*, and if D' is the group such that Z(H) ≤ D' ≤ H, and D'/Z(H) = D*^λ,
then D is dense in G and D' is dense in H. It is now a consequence of Theorem
5.1 and the conditions (a), (b) that there exists an isomorphism δ of D upon
D' which induces φ in Z(G) and λ in D*; and it follows from Lemma A.1, (b)
that there exists a uniquely determined isomorphism ψ of G upon the closed
p-adic group H which induces δ in D. Clearly ψ induces φ in Z(G).

Applying the argument used in Remark 6.4 we can show that G/Z(G) is a
p-adic group. Consequently there exists at most one isomorphism of G/Z(G)
upon H/Z(H) which induces a given isomorphism in the group D*, dense in
G*. Since ψ and λ are equal isomorphisms on D*, this implies, therefore, that ψ
induces λ in the whole group G/Z(G).

It may be noted that by the method applied in the proof even the follow- 
ing slightly more general statements may be proved: 


COROLLARY A.3. Suppose that G and H are closed p-adic groups and that
C(G) ≤ S ≤ Z(G), C(H) ≤ T ≤ Z(H). Then there exists an isomorphism of G upon
H which maps S upon T if, and only if, there exists an isomorphism σ of S upon
T and an isomorphism λ of G/S upon H/T which satisfy the conditions (a)
and (b) of Theorem 5.1. If furthermore G/S is a p-adic group, then exactly the
pairs σ, λ satisfying (a) and (b) of Theorem 5.1 may be realized by isomorphisms
of G upon H which map S upon T.


Appendix B. Conformalities which preserve the automorphisms. We prove 
the following theorem: 


THEOREM B.1. If G is a group with abelian central quotient group, if the subgroup
S of G does not contain elements of order 2, and if C(G) ≤ S^2 ≤ S ≤ Z(G),
then there exists an abelian group H and a one-one correspondence φ which maps
G upon the whole group H and satisfies the conditions:

(1) (xy)^φ = x^φ y^φ for x in Z(G) and y in G.

(2) (x^n)^φ = (x^φ)^n for x in G and any integer n.

(3) φ induces a homomorphism of G upon H/S^φ.

(4) If η is a (proper or improper) automorphism of G such that S^η ≤ S, then
φ^{-1}ηφ is an automorphism of H.

Note that φ induces isomorphisms in Z(G) and in G/S.


† R. Baer, American Journal of Mathematics, vol. 59 (1937), pp. 99-117, §1.

Proof. Since C(G) ≤ S^2, and since S does not contain elements of order 2,
there exists to every pair x, y of elements in G a uniquely determined element
f(x, y) in S such that f(x, y)^2 = (x, y). Since f(x, y) is the only solution
of this equation in S, and since S is an abelian group, it follows that


f(x, x) = f(x, y)f(y, x) = 1, f(x, yz) = f(x, y)f(x, z);
and
f(x, y) = 1
if, and only if, (x, y) = 1.
Denote now by φ a one-one correspondence which maps G upon a set H
of elements. In H a multiplication may be defined by the equation
x^φ y^φ = (f(y, x)xy)^φ


for x and y in G. The product of any two elements in H is uniquely determined
by this definition. Since f(x, y) = 1 for x in Z(G), this correspondence φ between
the group G and the multiplicative manifold H satisfies condition (1);
and it satisfies (2), since f(x^i, x^j) = 1 for x in G and any integers i and j. This
implies in particular that the picture of the group unit in G is the uniquely
determined unit in H, and that the inverse of any element in H is uniquely
determined. That the multiplication in H is commutative follows from
x^φ y^φ = (f(y, x)xy)^φ = (f(y, x)(x, y)yx)^φ = (f(x, y)yx)^φ = y^φ x^φ;
and that the multiplication in H is associative, follows from


(x^φ y^φ)z^φ = (f(y, x)xy)^φ z^φ = (f(z, f(y, x)xy)f(y, x)xyz)^φ
= f(y, x)^φ f(z, x)^φ f(z, y)^φ (xyz)^φ,


x^φ(y^φ z^φ) = x^φ(f(z, y)yz)^φ = (f(f(z, y)yz, x)xf(z, y)yz)^φ
= f(z, y)^φ f(z, x)^φ f(y, x)^φ (xyz)^φ.
Consequently H is an abelian group and φ satisfies (3), since all the elements
f(x, y) are contained in S.


† The idea for this proof has been suggested by an argument used by C. Hopkins for the proof
of a similar theorem, cf. C. Hopkins, these Transactions, vol. 37 (1935), pp. 169-170.

Suppose finally that η is an automorphism of G such that S^η ≤ S. Then
(f(x, y)^η)^2 = (f(x, y)^2)^η = (x, y)^η = (x^η, y^η) = f(x^η, y^η)^2;
consequently
f(x, y)^η = f(x^η, y^η),
since f(x, y)^η is an element of S. Hence
(x^φ y^φ)^{φ^{-1}ηφ} = (f(y, x)xy)^{ηφ} = (f(y^η, x^η)x^η y^η)^φ


= (x^η)^φ (y^η)^φ = (x^φ)^{φ^{-1}ηφ} (y^φ)^{φ^{-1}ηφ},


and this completes the proof of the Theorem.
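
The construction in this proof can be tried on a small group. The sketch below rests on assumptions that are not in the text: G is the Heisenberg group of triples (a, b, c) over the integers mod 3, S is its central (so that C(G) = S^2 = S = Z(G) and S has no elements of order 2), and the helper names mul, inv, f, new_mul and power are hypothetical. It forms f(x, y) as the unique square root in S of the commutator (x, y) and the new product f(y, x)xy on the same set of elements.

```python
from itertools import product

P = 3               # assumed odd prime, so squaring is one-one on S
HALF = (P + 1) // 2  # inverse of 2 modulo P

def mul(x, y):
    """Product in G: Heisenberg group of triples (a, b, c) over Z/P."""
    (a, b, c), (a2, b2, c2) = x, y
    return ((a + a2) % P, (b + b2) % P, (c + c2 + a * b2) % P)

def inv(x):
    """Inverse in G."""
    a, b, c = x
    return ((-a) % P, (-b) % P, (-c + a * b) % P)

def commutator(x, y):
    """(x, y) = x y x^{-1} y^{-1}, so that xy = (x, y) yx."""
    return mul(mul(x, y), mul(inv(x), inv(y)))

def f(x, y):
    """The unique element of S = {(0, 0, c)} whose square is (x, y)."""
    k = commutator(x, y)
    assert k[0] == 0 and k[1] == 0  # commutators lie in S
    return (0, 0, (k[2] * HALF) % P)

def new_mul(x, y):
    """Product in H, written on the elements of G: f(y, x) x y."""
    return mul(mul(f(y, x), x), y)

def power(x, n, op):
    """n-fold product of x with itself under op, for n >= 0."""
    result = (0, 0, 0)
    for _ in range(n):
        result = op(result, x)
    return result

G = list(product(range(P), repeat=3))
```

On this example the new product is commutative and associative although G is not abelian, and powers of a single element are the same in G and in H, in line with condition (2).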

Example B.2. This example is to show that the Theorem B.1 would not
hold without the assumptions concerning the existence of a subgroup S without
elements of order 2 such that C(G) ≤ S^2 ≤ S.

Denote by Z an infinite cyclic group generated by an element z, and by G*
a direct product of two infinite cyclic groups. If u*, v* is any basis of G*, then
an operation x* o y* of G* in Z which satisfies the conditions (1) to (4) of Corollary
2.3 is characterized by u* o v* = z. Let G be the (essentially uniquely determined)
group such that Z = C(G) = Z(G), G* = G/Z which realizes the
operation x* o y*.

This group is conformal to a direct product of three infinite cyclic groups. 

Suppose that φ is a one-one correspondence which maps G upon some
group H and which satisfies the conditions:

(i) (xy)^φ = x^φ y^φ for x in Z(G) and y in G.

(ii) (x^n)^φ = (x^φ)^n for x in G and any integer n.

(iii) φ induces a homomorphism of G upon H/Z(G)^φ.

(iv) If η is a proper automorphism of G, then φ^{-1}ηφ is an automorphism
of H.

Note that (i) and (ii) are identical with the conditions (1) and (2) of the 
Theorem B.1, whereas (iii) and (iv) are even weaker conditions than the cor- 
responding conditions (3) and (4) of the Theorem B.1. 

Since φ is a one-one correspondence, an element f(x, y) is uniquely determined
by the equation


x^φ y^φ = (f(y, x)xy)^φ
for x and y in G, and it is a consequence of (iii) that f(y, x) is an element of
Z(G). It is a consequence of (i) that f(x, y) = f(Z(G)x, Z(G)y) is independent
of the choice of x and y in their respective classes in G/Z(G); and it is a consequence
of (ii) that f(x^i, x^j) = 1 for x in G and integral i and j.

Let now u and v be some representatives of the classes u* and v* of G/Z,
respectively. Then an automorphism η of G is defined by


uy = uD, 
since (u7, v7) =(uv, v-!) =(v, wu) v)7. It is therefore a con- 
sequence of condition (iv) that 
(f(v, u) = (f(v, u)uv)7* = (f(v, u)uv) 
= (uty?) = = 
= (u7)*(v7)* = (f(v7, 
and this implies, since f(u, v) is an element of Z, that 
= = f(v7, ut) = un). 
Another automorphism . of G is defined by 


= = 
ut =v, v= 4, st = 


and on applying a similar argument it follows that f(v, u)-!=f(u, v). 
Finally it is a consequence of all these equations and (ii) that 


u*) = = 
= f(u, 0) vu) *f(u-?, *(v, 
= f(u, u)*(v, 
= f(u, v)*f(v, u)* = (f(u, u))*; 


and this last expression is not equal to one, since (v, u) = z^{-1} is not the square
of an element in Z. Thus H is not an abelian group.

If Z is a cyclic group of order 2^m for 0 < m, z an element generating Z,
if G* is a direct product of two cyclic groups of order 2^{m+1}, and if u*, v* is a
basis of G*, then let G be the group which satisfies C(G) = Z ≤ Z(G), G/Z = G*,
and which realizes the operations characterized by u* o v* = z, P(2^{m+1}, u*)
= P(2^{m+1}, v*) = 1. (This group G has been discussed at the end of §8.) G is
conformal to an abelian group, but practically the same argument as the one
used above proves that there does not exist a transformation φ into an abelian
group which satisfies the conditions (i) to (iv).


COROLLARY B.3. If G is a group with abelian central quotient group, and if
C(G) = C(G)^2 does not contain elements of order 2, then there exists a conformality
φ of G upon an abelian group H which satisfies the conditions (1) and (2) of
Theorem B.1 and transforms the (proper or improper) automorphisms of G into
automorphisms of H.

This is a consequence of the fact that every automorphism of G maps
C(G) upon a subgroup of C(G) and of Theorem B.1.

Note that the conditions of this Corollary B.3 are satisfied, if G/Z(G) is 
an abelian group whose elements are of finite odd order. 

Appendix C. Operator groups. If G is a group, then Φ is said to be a set
of operators for G, if x^φ is, for every x in G and for every φ in Φ, a uniquely
determined element in G such that (xy)^φ = x^φ y^φ. Φ may be called an associative
set of operators for the group G, if a multiplication of the elements in Φ is
defined which satisfies (x^φ)^η = x^{φη}.

If G is a group, and if Φ is a set of operators for the group G, then the
subgroup S of G is said to be Φ-admissible if S^φ ≤ S for every φ in Φ. The commutator
group C(G) is Φ-admissible for every set Φ of operators. But it is
easy to construct groups G and sets Φ of operators of G such that the central
Z(G) is not Φ-admissible.

If G is a group, and if Φ is a set of operators for G, then the Φ-central
Z(G, Φ) of G may be defined as the set of all those elements x in G which
satisfy xy = yx, for every y in G, and x^φ = x, for every φ in Φ; and the subgroup
C(G, Φ) of G, which is generated by the elements (x, y) and x^φ x^{-1} for x, y in G
and φ in Φ, may be called the Φ-commutator group of G. Both Z(G, Φ) and
C(G, Φ) are normal Φ-admissible subgroups of G. Z(G, Φ) is the greatest subgroup
of G in which the inner automorphisms of G and the operators in Φ
induce the identity transformation. C(G, Φ) is the smallest subgroup of G
whose quotient group is abelian and in whose quotient group all the operators
in Φ induce the identity only. Thus Z(G, Φ) and C(G, Φ) are Φ-characteristic
subgroups of the Φ-operator group G.

The obvious generalization of the groups with abelian central quotient
group are the groups satisfying the relation


C(G, Φ) ≤ Z(G, Φ).


They are groups with abelian central quotient group, since C(G) ≤ C(G, Φ)
and Z(G, Φ) ≤ Z(G). Thus it is advisable to consider G as an extension of a
group S between C(G, Φ) and Z(G, Φ) (S is then also situated between C(G)
and Z(G)) by the group G/S = G*. S and G* are abelian groups essentially
without operators. Hence the only problem is to characterize the operators
in Φ by relations between G* and S, and this may be done by practically the
same method as has been used in §10 in order to describe the automorphisms
in the group Ω(S ≤ G) since the elements in Φ induce automorphisms of G
which belong (as automorphisms) to Ω(S ≤ G).


GROUPS WITH PREASSIGNED CENTRAL
AND CENTRAL QUOTIENT GROUP*


BY 
REINHOLD BAER 


Every group G determines two important structural invariants, namely
its central C(G) and its central quotient group Q(G) = G/C(G). Concerning
these two invariants the following two problems seem to be most elementary.
If A and B are two groups, to find necessary and sufficient conditions for the
existence of a group G such that A and C(G), B and Q(G) are isomorphic
(existence problem) and to find necessary and sufficient conditions for the
existence of an isomorphism between any two groups G' and G'' such that
the groups A, C(G') and C(G'') as well as the groups B, Q(G') and Q(G'') are
isomorphic (uniqueness problem). This paper presents a solution of the existence
problem under the hypothesis that B (the presumptive central quotient
group) is a direct product of (a finite or infinite number of finite or infinite)
cyclic groups whereas a solution of the uniqueness problem is given only under
the hypothesis that B is an abelian group with a finite number of generators.

There is hardly any previous work concerning these problems. Only those
abelian groups with a finite number of generators which are central quotient
groups of suitable groups have been characterized before.†

1. Before enunciating our principal results (in §2) some notation and con- 
cepts concerning abelian groups will have to be recorded for future reference. 

If G is any abelian group, then the composition of the elements in G is
denoted as addition: x + y. If n is any positive integer, then nG consists of
all the elements nx for x in G, and G_n consists of all the elements x in G such
that nx = 0. F(G) is the subgroup of all the elements of finite order in G, that
is, the join of the groups G_n, and F(G, p) is the subgroup of all those elements
in F(G) whose order is a power of the prime number p; in other words,
F(G, p) is the join of all the groups G_{p^i}. It is well known that F(G) is the direct
sum of the groups F(G, p).

A set B of elements in G is termed independent, if all the elements in B 
are non-zero, if none of the elements in B is a multiple of another element in 
B, and if the group, generated by the elements in B, is the direct sum of the 
cyclic groups which are generated by the elements in B. If G is generated by
B, and if B is independent, then B is called a basis of G. It may be mentioned
that the elements u and v (neither zero) are dependent if there exist integers i
and k such that iu = kv ≠ 0.


* Presented to the Society, September 10, 1937; received by the editors September 20, 1937.
† R. Baer, Mathematische Zeitschrift, vol. 38 (1934), p. 406.

If G is an abelian group, then there need not exist a basis of G, but there 
always exists a greatest independent subset of G. Each greatest independent 
subset of G contains a certain (finite or infinite) number of elements, and the 
least of these numbers is called the rank r(G) of G. If either every non-zero 
element in G is of infinite order or every non-zero element in G is of order p,
then every greatest independent subset of G contains exactly r(G) elements. 

Numerical invariants:


r(G, 0) = r(G (mod F(G))), r(G, p^i) = r((p^{i-1}G)_p (mod (p^iG)_p)).


If G is a direct sum of cyclic groups, then in every decomposition of G into
indecomposable direct summands there are exactly r(G, 0) cyclic direct summands
of infinite order and exactly r(G, p^i) cyclic direct summands of order
p^i. It is a consequence of this fact that the structure of direct sums of cyclic
groups is completely determined by the invariants r(G, 0), r(G, p^i).
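
For a direct sum of finitely many cyclic groups these invariants can be read off by splitting every finite summand into its prime-power parts. The following is a minimal sketch; the encoding (a list of orders, with 0 standing for an infinite cyclic summand) and the names prime_power_factors and invariants are assumptions made for illustration and do not occur in the paper.

```python
from collections import Counter

def prime_power_factors(n):
    """Return the prime powers p**i whose product is n, for n >= 2."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            factors.append(q)
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def invariants(orders):
    """Count indecomposable cyclic summands of a direct sum of cyclic groups.

    orders lists the orders of the cyclic summands, 0 meaning infinite order.
    The result maps 0 to r(G, 0) and each prime power p**i to r(G, p**i).
    """
    counts = Counter()
    for n in orders:
        if n == 0:
            counts[0] += 1
        elif n > 1:  # a summand of order 1 is trivial and contributes nothing
            for q in prime_power_factors(n):
                counts[q] += 1
    return dict(counts)
```

For instance, invariants([12, 2, 0]) and invariants([6, 4, 0]) agree, reflecting that the two decompositions describe isomorphic groups, whereas invariants([4]) and invariants([2, 2]) differ.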

If S is any subgroup of the abelian group G, then G/S denotes the group 
of classes of residues of G (mod S), and the direct sum of the abelian groups 
G, isdenoted 

2. The following theorems contain the main results of this investigation: 


EXISTENCE THEOREM. If V is an abelian group and if G is a direct sum of
cyclic groups, then the following conditions are necessary and sufficient for the
existence of a group whose central is V and whose central quotient group is isomorphic
to G:

(i) If G contains elements of order p^i, then V contains elements of order p^i.

(ii) If G contains elements of infinite order, then V contains elements of infinite
order, or the orders of the elements in F(V) are not bounded.

(iii) If the orders of the elements in F(G) are bounded, and if r(G, 0) is a
finite positive integer, then V contains elements of infinite order and 1 < r(G, 0).

(iv) If the orders of the elements in F(G) are bounded, and if r(G, 0) is an
odd (finite) number, then V contains two independent elements of infinite order
(1 < r(V, 0)).

(v) If G = F(G), F(G, p) ≠ 0, and the orders of the elements in F(G, p) are
bounded, then F(G, p) contains at least two independent elements of maximum order.

(vi) If G=F(G), if the orders of the elements in F(G, p) are bounded, if 
r(G, pit*) is finite for O<k, and if r(G, p*) is odd, then V contains two inde- 
pendent elements of order (1 <r((p*V),). 


The following corollary is easily derived from this theorem:

COROLLARY. The direct sum G of cyclic groups is isomorphic to the central
quotient group of a suitable group if, and only if,

(a) r(G, 0) = 1 implies that the orders of the elements in F(G) are not
bounded;

(b) G = F(G) and r(G, p^i) = 1 imply that G contains elements of order p^{i+1}.

For (a) and (b) are part of the conditions (iii) and (v), respectively, and 
if G satisfies the conditions (a) and (b), then it is fairly obvious how to con- 
struct a group V such that G and V satisfy the conditions (i) to (vi) of the 
existence theorem. 


UNIQUENESS THEOREM. If G and V are abelian groups, and if G may be
generated by a finite number of its elements, then the following conditions are
necessary and sufficient for the existence of one, and essentially only one, group
whose central is V and whose central quotient group is isomorphic to G:

(1) If p is a prime number such that G contains elements of order p, then
V = pV and V contains elements of order p.

(2) If G contains elements of infinite order, then V = nV for every positive
integer n, and V contains elements of infinite order.

(3) If r(G, 0) ≠ 0, then 1 < r(G, 0) < 4.

(4) If r(G, 0) = 3, then F(G) = 0, F(V) = 0, and 2 = r(V, 0) (= r(V)).

(5) If r(G, 0) = 2 and F(G, p) ≠ 0, then F(G, p) is a direct sum of an odd
number of isomorphic cyclic groups, and 1 = r(V_p).

(6) If G = F(G) (that is, if G is a finite group), if F(G, p) ≠ 0, and if
2 < r(V_p), then F(G, p) is a direct sum of two isomorphic cyclic groups.

(7) If G = F(G), F(G, p) ≠ 0, and 2 = r(V_p), then F(G, p) is a direct sum of
two cyclic groups of order p^{i+j} and one cyclic group of order p^i where 0 ≤ i,
0 ≤ j, 0 < i + j.

(8) If G = F(G), F(G, p) ≠ 0, and 1 = r(V_p), then F(G, p) is a direct sum of
two isomorphic groups.

Remark. If V and G satisfy condition (1), then at least one of them is 
infinite. 

The proofs of these theorems will occupy us throughout this paper. In 
fact we shall prove slightly more than is needed for these theorems. Thus it 
will be possible to derive, from the facts presented in the following sections, 
the following statement which shows that the hypothesis of the existence of a 
finite basis in G is not a necessary one (though it is needed): 

If G is a direct sum of finite cyclic groups of order n, if r(G) = ℵ_0, and if
V = nV and V_n is a cyclic group of order n, then there exists one, and essentially
only one, group whose central is V and whose central quotient group is isomorphic
to G.

3. The proofs of the theorems enunciated in §2 are based on the following
theorems which transform the “metabelian” problems into abelian problems 
and which have been proved by the author in an earlier paper.* The following 
definitions are found necessary for the statement of these transformation 
theorems: 

If G and V are two abelian groups, then the operation xy is called a multi- 
plication of G in V if it obeys the following rules: 

(3.1) If x and y are elements in G, then xy is a uniquely determined element 
in V. 

(3.2) 0 = xx = xy + yx, x(y + z) = xy + xz.

A multiplication xy of G in V may be called a proper multiplication of G 
in V, if it satisfies the condition: 

(3.3) wx = 0 for every x in G if, and only if, w = 0.
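
Rules (3.2) and (3.3) can be checked mechanically on a small example. The following is a minimal sketch and is not taken from the paper: it assumes G = Z_n + Z_n and V = Z_n, and it uses the determinant form x_1 y_2 − x_2 y_1 (mod n) as a hypothetical multiplication of G in V. The names `mult`, `add`, `is_multiplication`, and `is_proper` are illustrative only.

```python
from itertools import product


def mult(x, y, n):
    """Assumed example: determinant form on Z_n + Z_n with values in Z_n."""
    return (x[0] * y[1] - x[1] * y[0]) % n


def add(x, y, n):
    """Componentwise addition in Z_n + Z_n."""
    return ((x[0] + y[0]) % n, (x[1] + y[1]) % n)


def is_multiplication(n):
    """Brute-force check of (3.2): 0 = xx = xy + yx and x(y + z) = xy + xz."""
    G = list(product(range(n), repeat=2))
    for x in G:
        if mult(x, x, n) != 0:
            return False
        for y in G:
            if (mult(x, y, n) + mult(y, x, n)) % n != 0:
                return False
            for z in G:
                lhs = mult(x, add(y, z, n), n)
                rhs = (mult(x, y, n) + mult(x, z, n)) % n
                if lhs != rhs:
                    return False
    return True


def is_proper(n):
    """Check (3.3): wx = 0 for every x in G only when w = 0."""
    G = list(product(range(n), repeat=2))
    return all(any(mult(w, x, n) != 0 for x in G) for w in G if w != (0, 0))
```

Under these assumptions both checks succeed for small n, so the determinant form is a proper multiplication of Z_n + Z_n in Z_n in the sense defined above.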

We define an admissible set of functions of G in V to consist of a proper 
multiplication xy of G in V and of functions P(n, x) which satisfy the following
conditions:

(3.4) If G contains elements of order n, and if x is an element in G_n, then
P(n, x) is a uniquely determined element in V (mod nV).

(3.5) If G contains elements of order n, and if x and y are elements in G_n,
then

P(n, x + y) = n(n − 1)2^{-1} xy + P(n, x) + P(n, y).

(3.6) If m is a positive integer, G contains elements of order nm, and if x
is an element in G_n, then

P(nm, x) = mP(n, x).

(3.7) If G contains elements of order nm and if x is an element in G_{nm}, then

P(n, mx) = nV + P(nm, x).

The following theorem constitutes a transformation of the existence theorem:


THEOREM.† There exists a proper multiplication of the abelian group G in the
abelian group V if, and only if, there exists a group whose central is isomorphic 
to V and whose central quotient group is isomorphic to G. 


The following is a transformation of the uniqueness theorem: 


*R. Baer, Groups with abelian central quotient group, these Transactions, vol. 44 (1938), pp. 
357-386. 
† Baer, ibid., Corollary 2.3.
THEOREM.* Any two groups whose centrals are isomorphic to the abelian
group V and whose central quotient groups are isomorphic to the direct sum G
of cyclic groups are isomorphic if, and only if, there exists for any pair

xy, P(n, x) and x ∘ y, P_0(n, x)

of sets of admissible functions of G in V an automorphism ν of V and an
automorphism γ of G such that

(xy)^ν = x^γ ∘ y^γ and P(n, x)^ν = P_0(n, x^γ).


4. The relations of the conditions which are involved in the definition of 
a multiplication xy of the abelian group G in the abelian group V may be 
analyzed as follows: 

If 0 = xy + yx is always satisfied, then 2xx = 0 is inferred by putting x = y.
If xx = 0 and x(y+z) = xy + xz, as well as (y+z)x = yx + zx (this would be a
consequence of xy + yx = 0!), are satisfied, then 0 = (x+y)(x+y) = xx + xy + yx
+ yy = xy + yx.

If, finally, xy is a proper multiplication of G in V, then the hypothesis 
that G is an abelian group is a consequence of the other conditions. 


(4.1) If xy is a multiplication of G in V, and if u is an element in G_n, then
ux and xu are elements in V_n for every x in G.


For multiplications are associative with regard to multiplication by integers,
that is, n(xy) = (nx)y = x(ny).

If xy is a multiplication of G in V, then to every element g in G there corresponds
the homomorphism of G into V which is defined by mapping the element
x in G upon the element gx in V. To the sum of two elements in G there
corresponds the sum of the corresponding homomorphisms, and to any two 
different elements in G there correspond different homomorphisms of G into 
V if, and only if, xy is a proper multiplication of G in V. Thus every multi- 
plication of G in V defines a representation of G as a group of homomorphisms 
of G into V, and this representation is a true one if, and only if, the multi- 
plication is a proper one. 
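
The correspondence between elements g and homomorphisms x → gx can be tabulated for a small group. The sketch below is an illustration under stated assumptions, not part of the paper: G = Z_n + Z_n, V = Z_n, and the hypothetical multiplication xy = c(x_1 y_2 − x_2 y_1) (mod n) with a scaling factor c. It compares "different elements give different homomorphisms" with condition (3.3).

```python
from itertools import product


def make_mult(n, c):
    """Assumed example: xy = c*(x1*y2 - x2*y1) mod n on G = Z_n + Z_n, V = Z_n."""
    def mult(x, y):
        return (c * (x[0] * y[1] - x[1] * y[0])) % n
    return mult


def representation(n, c):
    """Map each g in G to the value table of the homomorphism x -> gx."""
    G = list(product(range(n), repeat=2))
    mult = make_mult(n, c)
    return {g: tuple(mult(g, x) for x in G) for g in G}


def is_true_representation(n, c):
    """True when different elements of G induce different homomorphisms."""
    rep = representation(n, c)
    return len(set(rep.values())) == len(rep)


def is_proper(n, c):
    """Condition (3.3) for the scaled determinant form."""
    G = list(product(range(n), repeat=2))
    mult = make_mult(n, c)
    return all(any(mult(w, x) != 0 for x in G) for w in G if w != (0, 0))
```

With n = 4 the choice c = 1 gives a true representation, while c = 2 does not, because the element (2, 0) then induces the zero homomorphism.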

5. We prove the following statement: 


(5.1) If xy is a multiplication of G in V, and if u and v are elements in G
such that u and uv have the same order, then u and v are independent elements 
in G. 

Proof. Suppose that h and k are integers such that hu = kv. If neither h
nor k is 0, then let d be the g.c.d. of h and k. There exist, therefore, integers


* Baer, ibid., Theorem 6.2. 
h', k' such that d = hh' + kk', and it follows that

d(uv) = hh'uv + kk'uv = h'(hu)v + k'u(kv) = h'kvv + k'huu = 0.

Hence d is a multiple of the order of uv, and, since u and uv have the same
order, this implies du = 0 and therefore hu = 0 provided u and uv are of finite
order; whereas the above equation leads to a contradiction if u and uv are
both of infinite order. Thus u and v are independent.
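
Statement (5.1) can be confirmed by exhaustive search in a small finite case. The following sketch assumes G = Z_n + Z_n with the determinant form as a hypothetical multiplication in V = Z_n; the helper names are illustrative. It tests every pair u, v for which u and uv have the same order and checks that hu = kv forces both sides to vanish.

```python
from itertools import product
from math import gcd


def order(x, n):
    """Order of the element x of Z_n + Z_n (l.c.m. of the component orders)."""
    result = 1
    for a in x:
        oa = n // gcd(a, n)
        result = result * oa // gcd(result, oa)
    return result


def mult(x, y, n):
    """Assumed example multiplication: determinant form with values in Z_n."""
    return (x[0] * y[1] - x[1] * y[0]) % n


def independent(u, v, n):
    """True when hu = kv is only possible with both sides equal to 0."""
    for h, k in product(range(n), repeat=2):
        hu = ((h * u[0]) % n, (h * u[1]) % n)
        kv = ((k * v[0]) % n, (k * v[1]) % n)
        if hu == kv and hu != (0, 0):
            return False
    return True


def check_5_1(n):
    """Search for a counterexample to (5.1); return True if none exists."""
    G = list(product(range(n), repeat=2))
    for u, v in product(G, repeat=2):
        order_of_uv = n // gcd(mult(u, v, n), n)
        if order(u, n) == order_of_uv and not independent(u, v, n):
            return False
    return True
```

A search of this kind only illustrates the statement for the chosen groups; the general argument is the proof given above.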


(5.2) If xy is a proper multiplication of G in V, then there exists corresponding
to every element u in F(G) an element u' in G such that u and uu' have the
same order; and if u is an element of infinite order such that ux is an element of
finite order for every x in G, then the orders of the elements ux for x in G are not
bounded.


Proof. Suppose that all the elements ux for x in G are of finite order and
that the orders of the elements ux for x in G are bounded. Then the l.c.m.
of the orders of the elements ux is a finite positive integer m. It follows that
0 = m(ux) = (mu)x for every x in G, consequently mu = 0 by (3.3). Clearly m
is the order of u, as follows from (4.1) and the definition of m. Since it is
fairly clear how to prove the existence of an element u' in G such that m is
the order of uu', this completes the proof.


(5.3) If xy is a multiplication of G in V, and if b' and b'b'' have the same
order p^m, then b'' ≢ 0 (mod pG).


Proof. If b'' = pb, then b'b'' = p(b'b); consequently p^{m-1}(b'b'') = (p^m b')b
= 0, which contradicts the hypothesis that b'b'' is of order p^m.

6. If xy is a multiplication of G in V, and if S is a subgroup of G and T 
is a subgroup of V, then xy induces a multiplication of S in V/T. But clearly 
this induced multiplication need not be a proper multiplication of S in V/T, 
even if the inducing multiplication is a proper one. 


(6.1) If xy is a proper multiplication of G in V, if m is a positive integer 
such that mG contains multiples, not zero, of every non-zero element in G, and 
if T is a subgroup of V such that 0 is the cross cut of T and mV, then xy induces a 
proper multiplication of G in V/T.


Proof. If w ≠ 0 is an element in G, then there exists an integer k and an element
u in G such that kw = mu ≠ 0. Since xy is a proper multiplication of G
in V, there exists an element w' in G such that 0 ≠ kww' = muw'. Since 0 is
the only element contained in T as well as in mV, it follows that kww', and
consequently ww', are not elements in T; and this implies that xy defines a
proper multiplication of G in V/T.




1938] CENTRAL AND CENTRAL QUOTIENT GROUPS 393 


(6.2) If xy is a proper multiplication of G in V, if G is generated by its
subgroups G_1 and G_2, and if there exists a positive integer m such that mx_1x_2 = 0
for x_i in G_i and such that mG_1 contains multiples, not zero, of every non-zero
element in G_1, then xy defines a proper multiplication of G_1 in V.


Proof. If w ≠ 0 is an element in G_1, then there exists an integer k and an
element w' in G_1 such that kw = mw' ≠ 0. Since xy is a proper multiplication
of G in V, there exists an element u in G such that kwu ≠ 0. Since G is generated
by G_1 and G_2, there exist elements u_i in G_i such that u = u_1 + u_2. Consequently,
0 ≠ kwu = kwu_1 + mw'u_2 = kwu_1, since w' is an element
in G_1. Hence wu_1 ≠ 0; and xy defines, therefore, a proper multiplication
of G_1 in V.

Note that the hypothesis mx_1x_2 = 0, for x_i in G_i, is satisfied if mG_2 = 0.

7. If G is the direct sum of its subgroups G_ν, and if, for every ν, a multiplication
xy of G_ν in V is given, then there exists one and only one multiplication
xy of G in V which induces the given multiplications in the groups G_ν and satisfies
rs = 0 for elements r and s which belong to different components G_ν. This
multiplication xy of G in V is a proper one if, and only if, the induced multiplications
of the groups G_ν in V are proper multiplications.

Notation. If xy is a multiplication of G in V, then M(G, xy) is the subgroup
of V generated by the elements xy for x and y in G; and if B is a subgroup
of G, and S is a subgroup of V, then (B<G; S, xy) consists of all the
elements w in G such that wb ≡ 0 (mod S) for every b in B.

Thus a multiplication xy of G in V is also a multiplication of G in M(G, xy),
and a proper multiplication of G in V is a proper multiplication of G in 
M(G, xy). (If G is generated by two elements, then M(G, xy) is a cyclic group 
(which may be 0); and if G is generated by a finite number of elements, then 
so is M(G, xy).) It may be noted that (B<G; S, xy) is always a subgroup of 
the group G. 
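
Both pieces of notation are finite computations when G and V are finite. The sketch below is an illustration only: it assumes G = Z_n + Z_n, V = Z_n, and the determinant form as a hypothetical multiplication, and the function names `M` and `annihilator` are chosen here for M(G, xy) and (B<G; S, xy).

```python
from itertools import product
from math import gcd


def mult(x, y, n):
    """Assumed example multiplication: determinant form with values in Z_n."""
    return (x[0] * y[1] - x[1] * y[0]) % n


def M(G, n):
    """Subgroup of V = Z_n generated by all products xy with x, y in G."""
    g = n
    for x, y in product(G, repeat=2):
        g = gcd(g, mult(x, y, n))
    return {(g * t) % n for t in range(n)}


def annihilator(B, G, S, n):
    """(B<G; S, xy): all w in G with wb in S for every b in B."""
    return {w for w in G if all(mult(w, b, n) in S for b in B)}
```

For n = 4 and B = G, taking S = 0 returns only the zero element (the multiplication is proper), while S = 2V returns 2G.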

LEMMA 7.1. If xy is a multiplication of G in V, if B is a subgroup of G generated
by the elements b' and b'' such that b', b'', and b'b'' all have the same order,
and if M(G, xy) is the direct sum of M(B, xy) and its subgroup S, then G is the
direct sum of (B<G; S, xy) and the cyclic groups generated by b' and b''.


Proof. It is a consequence of (5.1) that b' and b'' form a basis of B, since
b', b'', and b'b'' all have the same order. If w is an element in the cross cut of
B and (B<G; S, xy), then wb = 0 for every element b in B, since 0 is the cross
cut of S and M(B, xy). Since xy defines a proper multiplication of B in V,
this implies that w = 0. If finally u is any element in G, then ub' = s' + r'(b'b'')
and ub'' = s'' + r''(b'b''), for s', s'' in S and r', r'' suitable integers. Hence
u − r''b' + r'b'' is an element in (B<G; S, xy), and G is therefore the direct
sum of B and (B<G; S, xy). This completes the proof.


COROLLARY 7.2. If xy is a proper multiplication of the group G with a finite
number of generators in the cyclic group M(G, xy), then

(a) G is either a direct sum of a finite number of infinite cyclic groups or a
direct sum of a finite number of finite cyclic groups; and

(b) there exists a basis b_1', b_1'', ..., b_k', b_k'' of G with the following properties:

(i) b_j', b_j'', and b_j'b_j'' all have the same order;
(ii) b_i'b_h' = b_i''b_h'' = b_i'b_h'' = 0 for i ≠ h;

(iii) b_j'b_j'' = r_j b_{j-1}'b_{j-1}'' for 1 < j ≤ k; and if G is finite, then r_j is a divisor of
the order of b_{j-1}'.

(c) G is a direct sum of two isomorphic groups G' and G'' such that
M(G', xy) = M(G'', xy) = 0.

Proof. (c) is an obvious consequence of (b), and (a) follows easily from
(5.2) and the fact that the non-zero elements in a cyclic group are either all
of finite order or all of infinite order. Since M(G, xy) is a cyclic group, there
exists a pair of elements b_1', b_1'' in G such that b_1'b_1'' generates M(G, xy). Since
b_1'b_1'' is an element of maximum order in M(G, xy), it follows from (5.2) that
b_1', b_1'', and b_1'b_1'' all have the same order. If B is the subgroup of G generated
by b_1' and b_1'', then it follows from M(G, xy) = M(B, xy) and Lemma 7.1 that
G is the direct sum of (B<G; 0, xy) and the cyclic groups generated by b_1'
and b_1''. Since uv = 0 for u in (B<G; 0, xy) and v in B, it follows that xy
defines proper multiplications of B as well as of (B<G; 0, xy). Since
M((B<G; 0, xy), xy) is a subgroup of the cyclic group M(G, xy), it is itself
cyclic and (b) of Corollary 7.2 may be applied to (B<G; 0, xy) since it is
generated by fewer elements than G. Now the proof is easily completed by complete
induction.


COROLLARY 7.3. If G is a direct sum of ℵ_0 cyclic groups of equal finite order,
if xy is a proper multiplication of G in V, and if M(G, xy) is a cyclic group,
then there exists a basis b_j', b_j'' for j = 1, 2, ... of G with the following properties:

(i) b_j', b_j'', and b_j'b_j'' all have the same order.
(ii) b_h'b_k' = b_h''b_k'' = b_h'b_k'' = 0 for h ≠ k.
(iii) b_1'b_1'' = ... = b_j'b_j'' = ... generates M(G, xy).


Proof. Let g_1, g_2, ..., g_i, ... be an enumeration of the elements in G.
Then by complete induction elements b_j', b_j'' and groups G_j will be defined
for 0 < j with the following properties:

(1) The group B_i which is generated by the elements b_j', b_j'' with j < i has
these elements as a basis.




(2) The elements b_j', b_j'' satisfy (i) to (iii).

(3) B_i contains the elements g_j with j < i.

(4) G is the direct sum of G_i and B_i.

(5) uv = 0, if u in G_i and v in B_i.

(6) xy induces proper multiplications of B_i and of G_i in V.

Since B_1 = 0, G_1 = G is a suitable start of this construction, it may be assumed
that elements b_j', b_j'', for j < i and a group G_i, have been defined which
meet the requirements (1) to (6). Then g_i = c + d with c in B_i and d in G_i.
Since G_i is by condition (4) a direct sum of ℵ_0 cyclic groups of equal order,
d is a multiple of an element b_i' in G_i which is of maximum order in G. There
exists, by (5.2) and by the fact that M(G_i, xy) is a cyclic group, an element
b_i'' in G_i such that b_i', b_i'', and b_i'b_i'' have the same order, and such that
b_i'b_i'' generates M(G_i, xy). Since the orders of the elements in G are finite,
b_i'' may be chosen in such a way that b_i'b_i'' = b_1'b_1''. Finally we may put
G_{i+1} = (C<G_i; 0, xy), where C is the subgroup generated by b_i', b_i''. Then it
follows from Lemma 7.1 that the elements b_j', b_j'', for j < i+1, and the group
G_{i+1} satisfy (1) to (6).

The subgroup generated by all the elements b_j', b_j'', for j = 1, 2, ..., contains
every element in G by (3); and the b_j', b_j'' form therefore, by (1), a
basis of G which satisfies (i) to (iii) by (2).


COROLLARY 7.4. If G is a direct sum of two infinite cyclic groups and of an
odd number of cyclic groups of the same finite order n, if xy is a proper multiplication
of G in V such that F(M(G, xy)) is a cyclic group, then there exists a
basis b, b', b'', b_1', b_1'', ..., b_k', b_k'' of G with the following properties:

(i) bb' = b_1'b_1'' = ... = b_k'b_k'' = c.
(ii) c, b, b_j', and b_j'' are elements of order n.

(iii) b', b'', and b'b'' are elements of infinite order.

(iv) bb'' = bb_j' = bb_j'' = b'b_j' = b'b_j'' = b''b_j' = b''b_j'' = b_j'b_h' = b_j''b_h'' = b_j'b_h'' = 0
for j ≠ h.


Proof. Denote by N the set (F(G)<F(G); 0, xy) of all elements w in
F(G) such that wx = 0 for every x in F(G). Then xy defines a proper multiplication
of F(G)/N in the cyclic group F(M(G, xy)). Therefore there exists,
by Corollary 7.2, a basis b_1', b_1'', ..., b_k', b_k'' of F(G) (mod N) which satisfies
the conditions (i) to (iii) of (b) of Corollary 7.2.

Denote by u_1, u_2 any pair of elements in G which forms a basis of G
(mod F(G)), and let w_j, for j = 1, 2, 3 be any three elements in N whose order
is a prime number p (dividing n). If d is some element of order p in M(G, xy),
then w_ju_i = r_{ji}d. If h_1, h_2, and h_3 are not trivial solutions of the two congruences
Σ_j h_j r_{ji} ≡ 0 (i = 1, 2) in three unknowns (mod p), then w = Σ_{j=1}^{3} h_jw_j belongs to N and


396 REINHOLD BAER [November 


satisfies wu_i = 0. Thus w = 0, since xy is a proper multiplication of G in V.
Hence the rank of N_p is at most two, and this implies, together with the existence
of the particular basis b_j', b_j'' of F(G) (mod N) (mentioned before), that
N_p is a cyclic group. N is therefore a cyclic group of order n, and it may be
assumed, without loss in generality, that the b_j', b_j'' satisfy (i). If g generates
N, then gu_i = r_ic, where c generates F(M(G, xy)). If r is the g.c.d. of r_1 and r_2,
then there exist integers r_1' and r_2' such that r_1r_1' + r_2r_2' = r, and it follows from
(5.2) that r and n are relatively prime. (If r_2 = 0, then r_1 is relatively prime to
n, and g may be chosen in such a way that r_1 = 1.) The elements


u_1' = r_1'u_1 + r_2'u_2,    u_2' = −r_2r^{-1}u_1 + r_1r^{-1}u_2


form a basis of G (mod F(G)) which satisfies gu_1' = rc, gu_2' = 0; thus it may be
assumed without loss in generality that an element b, generating N, and a
basis v_1, v_2 of G (mod F(G)) have been chosen in such a way that bv_1 = c,
bv_2 = 0. Put finally


b^{(i)} = v_i + Σ_{j=1}^{k} (s_{ij}'b_j' + s_{ij}''b_j''),    i = 1, 2,

where b^{(1)} = b', b^{(2)} = b'', and the numbers s are determined as solutions of the equations

v_ib_j' = s_{ij}''c,    v_ib_j'' = −s_{ij}'c.

Then a basis of G has been found which meets the requirements of Corollary 7.4.

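
The change of basis from u_1, u_2 to u_1', u_2' in the proof above rests on Bézout coefficients for r_1 and r_2. The sketch below is illustrative and its function names are not from the paper: it computes integers r_1', r_2' with r_1r_1' + r_2r_2' = r by the extended Euclidean algorithm and returns the coefficient rows of u_1' and u_2', so that the determinant condition (the new pair is again a basis) and the values gu_1' = rc, gu_2' = 0 can be checked numerically.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g, where g = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def new_basis(r1, r2):
    """Coefficient rows of u1', u2' in terms of u1, u2 (r1, r2 not both 0).

    Returns (row1, row2, r) with u1' = row1[0]*u1 + row1[1]*u2 and
    u2' = row2[0]*u1 + row2[1]*u2, where r = gcd(r1, r2).
    """
    r, s1, s2 = extended_gcd(r1, r2)
    row1 = (s1, s2)
    row2 = (-r2 // r, r1 // r)   # exact divisions, since r divides r1 and r2
    return row1, row2, r
```

For example, r_1 = 4 and r_2 = 6 give r = 2; the two rows have determinant 1, the first row pairs with (r_1, r_2) to r, and the second row pairs with (r_1, r_2) to 0.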
8. We prove the following lemma: 


LEMMA 8.1. If G is a direct sum of three cyclic groups, if xy is a proper multiplication
of G in V, and if M(G, xy) is a direct sum of two cyclic groups, then
there exists a basis b_1, b_2, b_3 of G such that b_1b_3 = 0, whereas b_1b_2 and b_2b_3 form
a basis of M(G, xy).

Proof. Denote by u, v a basis of M(G, xy), by ū the cyclic subgroup generated
by u, and by v̄ the cyclic subgroup generated by v. Then M(G, xy) = ū + v̄.
If both u and v are of finite order, it may be assumed that the order
of u is a divisor of the order of v. (Then it may happen that ū = 0.) There
exists a pair of elements b', b'' in G such that v = b'b''; and it is a consequence
of (5.2) and the choice of v that b', b'', and v have the same order. If B is the
subgroup of G generated by b' and b'', then it follows from Lemma 7.1 that
b', b'' form a basis of B and that G = B + (B<G; ū, xy). Clearly (B<G; ū, xy)
is a cyclic group generated by an element c. There exist in B elements b such
that bc = u and amongst these elements b there is one b_2 such that the homomorphism
of B, induced by b_2, maps B upon v̄. It is now clear how to complete
the proof.

9. Two multiplications xy and x ∘ y of G in V are called isomorphic, if
there exists an automorphism γ of G and an isomorphism ν of M(G, xy) upon
M(G, x ∘ y) such that

(xy)^ν = x^γ ∘ y^γ for x and y in G.

If the isomorphism ν may be chosen in such a fashion that it is induced by
some automorphism of V, then the two multiplications are termed equivalent.

The following propositions are fairly obvious consequences of the state- 
ments in §§7 and 8. 

If G is a direct sum of three (but not of two) cyclic groups, then any two
proper multiplications of G such that M(G, ...) are direct sums of two cyclic
groups are isomorphic.

If G is a direct sum of ℵ_0 isomorphic finite cyclic groups, or if G is a finite
group, then any two proper multiplications with the cyclic M(G, ...) are
isomorphic.

If G is a direct sum of two infinite cyclic groups and of an odd finite number
of isomorphic finite cyclic groups, then any two proper multiplications of
G with cyclic F(M(G, ...)) are isomorphic.

It is a consequence of (5.2) that any two proper multiplications of a di- 
rect sum of two cyclic groups are isomorphic. 

If G is a direct sum of a finite number of infinite cyclic groups, and if xy
is a proper multiplication of G such that M(G, xy) is a cyclic group, then the
numbers r_j appearing in (iii) of (b) of Corollary 7.2 may be chosen as positive
integers. Then it is possible to prove that they are invariants with regard to
isomorphisms. (This proof follows from the consideration of the subgroup
(G<G; nM(G, xy), xy) of all elements w in G such that wx ≡ 0 (mod nM(G, xy))
for every x in G.) It is then a consequence of Corollary 7.2 that two proper
multiplications of G with cyclic M(G, ...) are isomorphic if, and only if,
they induce the same invariants r_j.

10. The object of the next three sections is to construct proper multiplica- 
tions of somewhat elementary groups G. They will be combined afterwards to 
form proper multiplications of the more general types of groups G. 

If B is a basis of the group G, then a multiplication xy of G in V is completely
determined by the values of the products bb' for b and b' in B, and
these values may be chosen completely at random with the two restrictions
that bb = 0 and that bb' = −b'b. In enumerating the values of the products bb'
it may be understood that the product bb' need not be mentioned, if the value
of b'b has been given, and that bb' = 0, if the value of neither bb' nor b'b has
been mentioned.
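
The convention just described translates directly into a small constructor. The following is a sketch under stated assumptions and not the author's construction: G is taken to be a finite direct sum of cyclic groups with given orders, V is taken to be Z_N, and only products b_ib_j with i < j are listed. The sketch additionally rejects a listed value that is not annihilated by the orders of both factors, which (4.1) requires when the basis elements have finite order; the names `make_multiplication` and `is_proper` are hypothetical.

```python
from itertools import product


def make_multiplication(orders, N, table):
    """Build a multiplication of G in V = Z_N from products of basis elements.

    orders[i] is the order of the basis element b_i of G;
    table[(i, j)] with i < j is the chosen value of b_i b_j in Z_N.
    Products that are not listed are taken to be 0.
    """
    for (i, j), value in table.items():
        if not i < j:
            raise ValueError("list each pair once, with i < j")
        # Assumption made explicit: the value must be annihilated by the
        # orders of both factors, as (4.1) requires for finite orders.
        if (orders[i] * value) % N or (orders[j] * value) % N:
            raise ValueError("value incompatible with the orders of b_i, b_j")

    def coeff(i, j):
        if i < j:
            return table.get((i, j), 0)
        if i > j:
            return -table.get((j, i), 0)  # restriction bb' = -b'b
        return 0                           # restriction bb = 0

    r = len(orders)

    def mult(x, y):
        total = sum(x[i] * y[j] * coeff(i, j) for i in range(r) for j in range(r))
        return total % N

    return mult


def is_proper(orders, mult):
    """Condition (3.3), checked over all elements of the finite group G."""
    G = list(product(*(range(o) for o in orders)))
    zero = tuple(0 for _ in orders)
    return all(any(mult(w, x) != 0 for x in G) for w in G if w != zero)
```

As a usage example, with orders (4, 4, 2, 2), N = 4, and the two listed products b_0b_1 = 1 and b_2b_3 = 2, each pair of basis elements and its product have the same order, and the resulting multiplication passes the properness check; omitting the second product leaves b_2 and b_3 annihilating all of G.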

11. We prove the following proposition: 

(11.1) If G is a direct sum of finite cyclic groups, if G is a direct sum of two
isomorphic groups, and if V contains elements of order n whenever G contains
elements of order n, then there exists a proper multiplication of G in V.


Proof. There exists a basis of G, consisting of pairs b_ν', b_ν'' such that b_ν'
and b_ν'' always have the same order. There exists in V an element c_ν such
that b_ν', b_ν'', and c_ν have the same order. Then a proper multiplication of G
in V is determined by the rule b_ν'b_ν'' = c_ν, for every ν.


(11.2) If G is a direct sum of cyclic groups whose orders are powers of a fixed
prime number p, the orders of the elements in G are not bounded, and V contains
elements of every order p^i, then there exists a proper multiplication of G in V.


Proof. G = G' + G'' + G''', where G' and G'' are isomorphic groups and
where G''' has a basis b_1, b_2, ..., b_i, ... such that the orders of the elements
b_i are not bounded and such that the order of b_{i-1} divides the order of b_i.
There exists in V an element c_i whose order equals the order of b_i. A proper
multiplication of G''' in V is characterized by the equations b_ib_{i+1} = c_i, for
i = 1, 2, .... By (11.1) there exists a proper multiplication of G' + G'' in V,
consequently there exists a proper multiplication of G in V.


(11.3) If G is a direct sum of two cyclic groups of order p^n and of a finite
number of cyclic groups whose orders are divisors of p^n, and if V contains elements
of order p^n and contains two independent elements of order p^i in case
r(G, p^i) is odd, then there exists a proper multiplication of G in V.


Proof. This is a consequence of (11.1) if all the numbers r(G, p^i) are even.
If not all the numbers r(G, p^i) are even, then there exists a greatest integer m
such that r(G, p^m) is odd and 0 < m ≤ n. V contains an element u of order p^n
and an element v of order p^m which is independent of u. There exists a basis
b_1, b_2, ... of G such that the order p^{n(j)} of b_j satisfies the inequality
n(j) ≤ n(j−1) and consequently n = n(1) = n(2). If n = m, then b_3 is of order
p^n and in this case we put h = 3. If m < n, then let h be the smallest number such
that n(h) = m. The number h thus defined is in both cases an odd integer and
n(2j−1) = n(2j), for 0 < 2j < h.

It is now easily verified that a proper multiplication xy of G in V is characterized
by the equations

b_{2j−1}b_{2j} = p^{n−n(2j)}u, for 0 < 2j < h,
b_{j−1}b_j = p^{m−n(j)}v, for h ≤ j.

(11.4) If G is a direct sum of an infinity of cyclic groups of order p^n and of
cyclic groups whose orders divide p^n, and if V contains elements of order p^n,
then there exists a proper multiplication of G in V.

Proof. There exists a basis b_1, b_2, ..., b_ν, ... of G which is well ordered
by the subscripts of its elements b_ν in such a fashion that there is no last
element in the basis and that the order p^{n(ν)} of b_ν divides p^{n(κ)} for ν < κ.
There exists an element u of order p^n in V.

A proper multiplication of G in V is characterized by the equations 


= 


(11.5) Suppose that G is a direct sum of two cyclic groups of order p^n and
of a finite number of cyclic groups whose orders divide p^n and that V contains
an element of order p^n and two independent elements of order p^i, if r(G, p^i) is
odd. Then there exist two non-isomorphic, proper multiplications of G in V if
one of the following conditions is satisfied:

(a) 2 < r(G_p), 1 < r(V_p), and G is a direct sum of two isomorphic groups.

(b) 2 < r(G_p), 2 < r(V_p).

(c) 3 < r(G_p) and G is not a direct sum of two isomorphic groups.


Proof. If (a) is satisfied by G and by V, then there exists a basis
b_1, b_2, ..., b_{2k} such that 1 < k and such that b_{2i−1} and b_{2i} have the same
order p^{n(i)}. V contains an element u of order p^n and an element w of order p
which is independent of u. If h = 0 and if h = 1, then a proper multiplication
x ∘_h y of G in V is defined by the equations


bos = pr Ou, bai 0 = hw 


and

x ∘_0 y and x ∘_1 y

are clearly non-isomorphic multiplications of G in V.

If (b) is satisfied, then it may be assumed that G is not a direct sum of two
isomorphic groups, since otherwise (a) might be applied. Then there exists
a greatest integer m such that r(G, p^m) is odd and 0 < m ≤ n. V contains a subgroup
V' which is a direct sum of a cyclic group of order p^n and of a cyclic
group of order p^m; and V contains an element w of order p which is not contained
in V'.

There exists by (11.3) a proper multiplication xy of G in V’, and it is a 
consequence of Corollary 7.2 that M(G, xy) is not a cyclic subgroup of V’. 
Denote by x ⊙ y the multiplication of G in V which is characterized by the
equations

b_i ⊙ b_j = w,

where b_i, b_j is a pair of elements in a basis of G such that b_ib_j = 0. (That it is
possible to determine xy and a basis of G in such a fashion may be verified by
looking over the proof of (11.3).) Then

x ∘ y = (xy) + (x ⊙ y)

is a proper multiplication of G in V which satisfies

2 = r(M(G, xy)_p) < 3 = r(M(G, x ∘ y)_p);

consequently xy and x ∘ y are non-isomorphic proper multiplications of G
in V.

Assume now that the conditions of (c) are satisfied. Then there exists a
greatest integer m such that r(G, p^m) is odd and 0 < m ≤ n. Furthermore V
contains an element u of order p^n and an element v of order p^m which is independent
of u.

It is finally possible to decompose G in the form

G = G' + G'' + Σ_{j=1}^{k} Z(j),

where G' and G'' are isomorphic groups and the maximum order of the elements
in G' is p^n, and where Z(j) is a cyclic group of order p^{n(j)}, 1 ≤ k;
0 < n(k) < ... < n(1) = m ≤ n.

Two cases may be distinguished. 

Case 1. G' (and therefore G'') is a cyclic group of order p^n. Since G is
not a direct sum of three cyclic groups, this implies that 1 is less than k.
Denote by g', g'', and z(j) elements which generate G', G'', and Z(j), respectively.

A proper multiplication xy of G in V is characterized by the equations

g'g'' = u, g''z(1) = v, z(j−1)z(j) = p^{n−n(j)}u, for 1 < j ≤ k.

If x = x'g' + x''g'' + Σ_{j=1}^{k} x(j)z(j) is any element in G, then

xg' = −x''u, xg'' = x'u − x(1)v, xz(1) = x''v − x(2)p^{n−n(2)}u,
xz(j) = x(j−1)p^{n−n(j)}u − x(j+1)p^{n−n(j+1)}u, for 1 < j < k,
xz(k) = x(k−1)p^{n−n(k)}u.


Denote by W(xy, H) the set of all elements w in the subgroup H of G for 
which the set wG of all the elements wx for x in G is a cyclic subgroup of V. 
(This notation shall be used throughout this proof.) Clearly the element w 
in H belongs to W(xy, H) if, and only if, the elements wg’, wg’’, wz(j), for 
1<j<k, generate a cyclic subgroup of V. It may now be computed that 
W (xy, G) consists exactly of the elements of the form 
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k 
j=2 
and of the elements which may be represented in the form 
+ + 1), 
j=0 


where k = 2i+1, h(j) = (n(2j+2) − n(2j+3)) + ... + (n(k−1) − n(k)), or the
elements which may be represented in the form


i-1 
+ + 1), 
j=0 


where k = 2i, h(j) = (n(2j+2) − n(2j+3)) + ... + (n(k−2) − n(k−1)) + n(k).
Another proper multiplication x ∘ y of G in V is characterized by the equations
gog™ =u, glo2(k) = 
2G — = p™"u, for 1<jsk. 


Suppose now that w = w'g' + w''g'' + Σ_{j=1}^{k} w(j)z(j) is an element of G_p.
Then 
wog’ = — — w(k)p™*, wog”’ = w'u — w(1)2, 
woz(1) = — w(2)p*-* u, 
woz(j)= — wf t+i1)p "Fu, for 1<j<k, 
wo2(k) = 0, 


since n(k) < n(k−1) ≤ m ≤ n. Consequently W(x ∘ y, G_p) consists exactly of
those elements which may be represented in the form


k-1 


j=2 


if m <n, and which may be represented in the form 


k-1 
prtw'g! + or pw'g’ + 
j=2 
ifn=m. 
Thus W(xy, G_p) and W(x ∘ y, G_p) are of essentially different structure, and
the two proper multiplications xy and x ∘ y of G in V are not isomorphic.

Case 2. G' (and therefore G'') is not a cyclic group. Then there exists a
basis b_1', ..., b_i' of G', b_1'', ..., b_i'' of G'' such that 1 < i, b_j' and b_j'' have
the same order p^{m(j)} and m(i) ≤ ... ≤ m(1) = n. Proper multiplications xy
and x ∘ y of G in V are characterized by the equations


bj bj = bi2(1) = 2, 2(j — 1)2(j) = 
for 1<j<k and by the equations 
bj obj! = pr mu, bi o2(1) = 2, 2(j — 1) 02(7) = 
for 1<j<k, o 2(1)=p"-w, where g = min (m/(2), (Note that 
1<k, and that k=1 isa possibility.) 


The elements in W(xy, G) are exactly the elements contained in the fol- 
lowing two sets A and B: A consists of the elements of the form 


pmw(1)’by + + > + w(j)"bj’) + 


j=2 
and B consists of the elements of the form 


w(i)"b{’ + if k=1, 
w(1)"by + hz if 1<k, 
where 


h(j) = (n(2j+2) − n(2j+3)) + ... + (n(k−1) − n(k))
if k=2i+1; and 


i-1 


j=0 


h(j) = (n(2j+2) − n(2j+3)) + ... + (n(k−2) − n(k−1)) + n(k)


if k = 2i.

In order to show that W(x ∘ y, G) is essentially different from W(xy, G)
and that consequently the two proper multiplications xy and x ∘ y of G in V
are not isomorphic, the following remark will suffice:


A' ≤ W(x ∘ y, G) ≤ (A', B'),


where (A’, B’) may denote the set-theoretical join of the sets A’ and B’. 
Here A’ consists of the elements 


+ (w(2)'p* — + + w(2)"4" 


i k 
+ (w(j)’b} + + 


j=3 j=2 
and B' consists of the elements
+ (w(2)’p2 — w(1)’)bs + hz(1), 
if k=1, and of the elements 
w(1)’p™-%by + (w(2)’p2 — w(1)’)b2 + hz, 


if 1<k, where z has the same meaning as in the computation of W(xy, G). 
12. We prove the following statement: 


(12.1) If G is a direct sum of infinitely many infinite cyclic groups, and if
the orders of the elements in F(V) are not bounded, then there exists a proper
multiplication of G in V.


Proof. Clearly it is sufficient to prove the statement for countable groups
G. Then there exists a countable basis B of G, and the elements of such a basis
B may be denoted by (i_1, ..., i_n) where n and the i_j run over all the positive
integers. There exists furthermore to every positive integer i an element c(i)
in V whose order exceeds i. Then a proper multiplication of G in V is characterized
by the equations

(i_1, ..., i_n)(i_1, ..., i_n, i_{n+1}) = c(i_{n+1}).

(12.2) If G is a direct sum of infinite cyclic groups, if r(G, 0) = r(G) is not an
odd finite number, and if V contains elements of infinite order, then there exists a
proper multiplication of G in V. If 2 < r(G, 0), then there exists a proper multiplication
xy of G in V such that (G<G; 2M(G, xy), xy) = 2G and a proper multiplication
x ∘ y of G in V such that 2G < (G<G; 2M(G, x ∘ y), x ∘ y); and these
two proper multiplications of G in V are not isomorphic.

Proof. There exists a basis of G which consists of pairs b_ν', b_ν''. There
exists furthermore an element u of infinite order in V, and a proper multiplication
of G in V is characterized by the equations

b_ν'b_ν'' = u

for every ν. Clearly

2G = (G<G; 2M(G, xy), xy).

If 2 < r(G, 0), then a proper multiplication of G in V is characterized by
the equations

b_1' ∘ b_1'' = u, b_ν' ∘ b_ν'' = 2u

for every ν ≠ 1. Since (G<G; 2M(G, x ∘ y), x ∘ y) is generated by the elements
2b_1', 2b_1'', b_ν', b_ν'', for ν ≠ 1, it follows that 2G < (G<G; 2M(G, x ∘ y), x ∘ y).
This completes the proof.
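
The invariant used in this proof reduces to a finite computation, because membership of w in (G<G; 2M, ·) depends only on the coordinates of w modulo 2. The sketch below is an illustration under assumptions that are not in the paper: r(G, 0) = 4, the basis is ordered as b_1', b_1'', b_2', b_2'', and every product is recorded as an integer multiple of u, so that M is generated by u in both cases. The function names are hypothetical.

```python
from itertools import product


def gram_matrix(values):
    """values[k] = c_k, where the k-th basis pair satisfies b'b'' = c_k * u."""
    r = 2 * len(values)
    A = [[0] * r for _ in range(r)]
    for k, c in enumerate(values):
        A[2 * k][2 * k + 1] = c
        A[2 * k + 1][2 * k] = -c
    return A


def residues_in_subgroup(A):
    """Residues w mod 2 such that wb lies in 2M for every basis element b."""
    r = len(A)
    return [
        w for w in product(range(2), repeat=r)
        if all(sum(w[i] * A[i][j] for i in range(r)) % 2 == 0 for j in range(r))
    ]


def index_in_G(A):
    """Index of (G<G; 2M, .) in G, using that this subgroup contains 2G."""
    return 2 ** len(A) // len(residues_in_subgroup(A))
```

With the values (1, 1) for xy the subgroup is 2G itself, of index 16 in G, while the values (1, 2) for x ∘ y give a subgroup of index 4. Since the index is unchanged by isomorphisms of multiplications, the two cannot be isomorphic.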

(12.3) If G is a direct sum of three infinite cyclic groups, and if V is a direct
sum of two infinite cyclic groups and one cyclic group (which may be 0, finite, or
infinite), then there exists a proper multiplication xy of G in V such that
M(G, xy) = V.

Proof. There exists a basis b', b'', b''' of G, and there exists a basis u', u'',
v of V such that u' and u'' are elements of infinite order. Then a proper multiplication
of G in V is characterized by the equations


b’b”’ = u’, = v; 


and clearly V=M(G, xy). 

(12.4) If G is a direct sum of four infinite cyclic groups, and if V is a direct 
sum of an infinite cyclic group and one cyclic group, then there exists a proper 
multiplication xy of G in V such that M(G, xy) = V.


Proof. There exists a basis $b_1, \dots, b_4$ of G and a basis u, v of V, where u
may be an element of infinite order. A proper multiplication xy of G in V such
that V = M(G, xy) is characterized by the equations $b_1 b_2 = b_3 b_4 = u$, $b_2 b_3 = v$.

13. We prove the following statement: 

(13.1) If G is a direct sum of cyclic groups, if G ≠ F(G), if the orders of the
elements in F(G) are not bounded, and if V contains elements of order n whenever 
F(G) contains elements of order n, then there exists a proper multiplication of G 
in V. 

Proof. G = F(G)+U'+U'', where U' is a direct sum of a finite number
(not zero) of infinite cyclic groups and where U'' is either 0 or a direct
sum of an infinity of infinite cyclic groups. Since the orders of the elements
in F(V) are not bounded, it follows from (12.1) that there exists a proper
multiplication of U'' in V. Therefore it may be assumed without loss in
generality that r(G, 0) is finite.

If r(G, 0) is finite, then G is a direct sum of r(G, 0) groups H(i) such that 
the orders of the elements in F(H(i)) are not bounded and such that 
r(H(i), 0)=1. Thus it may be assumed without loss in generality that 
r(G, 0) =1. 

F(G) is a direct sum of two groups G' and G'' such that G'' is a direct
sum of two isomorphic groups and such that the orders in G' are not bounded
and all the numbers r(G', $p^i$) are finite. Since there exists by (11.1) a proper
multiplication of G'' in V, it is sufficient to prove the existence of a proper
multiplication of G'+ū in V, where u is an element in G which generates
G (mod F(G)) and where ū is the group generated by u.

If F(G', p) ≠ 0, then there exists a basis $b_{1p}, \dots, b_{ip}, \dots$ of F(G', p)
such that the order of $b_{i-1,p}$ is not greater than the order of $b_{ip}$. There exists,
furthermore, in V an element $v_{ip}$ of the same order as $b_{ip}$.

A proper multiplication of G'+ū in V is characterized by the following
equations

$b_{i-1,p}\, b_{ip} = v_{i-1,p}, \qquad b_{ip}\, u = v_{ip}$

for every i, if F(G', p) is infinite, and

$b_{i-1,p}\, b_{ip} = v_{i-1,p}, \qquad b_{mp}\, u = v_{mp},$

if F(G', p) is finite and m = r(F(G', p)).


(13.2) If G is a direct sum of a finite number of cyclic groups, if 1 < r(G, 0),
if V contains elements of the finite order n whenever F(G) contains elements of 
order n, if V contains elements of infinite order, and if V contains two independ- 
ent elements of infinite order, in case r(G, 0) is odd, then the following proposi- 
tions are true: 

(1) There exists a proper multiplication xy of G in V with the properties: 

(i) (F(G, p) <F(G, p); 0, xy) is a cyclic group which is 0 if, and only if, 
F(G, p) is a direct sum of an even number of isomorphic cyclic groups; 

(ii) M((F(G) <G; 0, xy), xy) does not contain non-zero elements of finite 
order. 

(2) There exists a proper multiplication xy of G in V with the property: 

(iii) (F(G, p) <F(G, p); 0, xy) is a cyclic group, not zero, if F(G, p) is a di- 
rect sum of an odd number of isomorphic cyclic groups, and it is a direct sum of 
two cyclic groups, not zero, otherwise, if F(G, p) ≠ 0.

(3) If F(G, p) ≠ 0, and if V contains two independent elements of order p,
then there exists a proper multiplication xy of G in V such that M(G, xy) con- 
tains two independent elements of order p. 

(4) If V contains elements of finite order, and if 2<r(G, 0), then there exists 
a proper multiplication xy of G in V such that M((F(G) <G; 0, xy), xy) con- 
tains non-zero elements of finite order. 


Proof. There exists a basis $b_1, b_2, \dots, b_h$ of G (mod F(G)) such that
1 < h = r(G, 0); and there exists a basis $b_{1p}, \dots, b_{kp}$ of F(G, p) such that
$0 \le k = k(p)$ (that is, k(p) = 0, if F(G, p) = 0) and such that the orders $p^{n_{jp}}$ of
the $b_{jp}$ satisfy the inequality $0 < n_{jp} \le n_{j-1,p}$ for 0 < j-1 < k. V contains an element
v(p) of order $p^{n_{1p}}$ if F(G, p) ≠ 0, an element u of infinite order, and, if r(G, 0)
is odd, an element v of infinite order which is independent of u.

A proper multiplication xy of G in V is characterized by the following
equations: 


$b_1 b_2 = b_3 b_4 = \cdots = u$;
= 97 

if, and only if, h is odd; and

babi, = W, 
if 2<h, where w is an element of finite order in V which shall be kept indeter- 
minate for the moment. If F(G, p) #0, then 

$b_{1p}\, b_1 = v(p), \qquad b_{1p}\, b_2 = w(p),$

where w(p) is either 0 or an element of order p which is independent of v(p),
and

$b_{j-1,p}\, b_{jp} = p^{m_{jp}}\, v(p)$ with $m_{jp} = n_{1p} - n_{jp}$.


If w = 0 and w(p) ≠ 0, then xy meets the requirements of (3). If w(p) = 0,
then (F(G) <G; 0, xy) is the direct sum of (F(G) <F(G); 0, xy), the cyclic 
groups generated by the elements $b_i$ for 1 < i, and the cyclic group generated
by where [],p"*. The requirements of (4) are therefore satisfied if 
w ≠ 0; and condition (ii) is satisfied if w = 0.

Suppose now that w=w(p) =0. Every element of F(G, p) has the form 


k(p) 
x=) consequently 


= — Xop™r(p), = — 
for 1<j<k(p), and 
= 
Thus x belongs to (F(G, p) <F(G, p); 0, xy) if, and only if, 
= Xj41 (mod = O (mod 


Consequently (F(G, p) <F(G, p); 0, xy) is the cyclic group generated by 


i-1 
7=0 
with 
hop = Naip, * hip = + (M2i-2p — Mas-1p) + * + — 


if k(p) =2i, and by 


j=0 


¥ 

4 

( 

= 
with 
if k(p) = 2i+1. It is now obvious that this multiplication xy satisfies (i).
In order to prove (2) we will consider the multiplication x o y of G in V
which is characterized by the equations 


b10 be = b3004 = 


if and only if h is odd; if F(G, p) ≠ 0, then


$b_{1p} \circ b_1 = v(p), \qquad b_{j-1,p} \circ b_{jp} = p^{m_{jp}}\, v(p)$, for j < k(p),
$b_{k(p)p} \circ b_2 = p^{m_{k(p)p}}\, v(p)$, if 1 < k(p).


If k(p) = 1, then (F(G, p) < F(G, p); 0, x o y) is the cyclic group generated by
$b_{1p}$. If 1 < k(p), then (F(G, p) < F(G, p); 0, x o y) is the direct sum of the
cyclic group generated by $b_{k(p)p}$ and the group W(p), consisting of all the
elements w in the group B, generated by the $b_{jp}$ with j < k(p), which satisfy
wx = 0 for every x in F(G, p). But W(p) = (B<B; 0, x o y), and in the group,
generated by B and the b; for 7 <k(p), the multiplication is of the same type 
as considered in the first part of the proof. Thus W(p) is a cyclic group, 
and W(p) =0 if, and only if, B is a direct sum of an even number of isomorphic 
cyclic groups. This completes the proof. 

14. Proof of the existence theorem. If G is a direct sum of cyclic groups, 
and if V is an abelian group, then it follows from the transformation of the 
existence theorem (§3) that the existence theorem is equivalent to the follow- 
ing proposition: 

There exists a proper multiplication of G in V if, and only if, the conditions 
(i) to (vi) of the existence theorem are satisfied. 


Suppose first that there exists a proper multiplication xy of G in V. Then 
(i) and (ii) are consequences of (5.2). Suppose now that the orders of the ele- 
ments in F(G) are bounded and that r(G, 0) is a finite positive integer. Then 
G = F(G)+U, where U ≠ 0 is a direct sum of a finite number of infinite cyclic
groups. If m is the finite maximum order of the elements in F(G), then muv =0 
for u in F(G) and v in U, and it is now a consequence of (6.2) that xy induces 
a proper multiplication of U in V. Since U is generated by a finite number of 
elements, M(U, xy) is also generated by a finite number of elements, and 
F(M(U, xy)) is therefore a finite group. Hence xy induces by (6.1) a proper 
multiplication of U in M(U, xy)/F(M(U, xy))=V*, and all the non-zero 
elements in V* are of infinite order. Now it follows from (5.2) that V* ≠ 0,
that is, that V contains elements of infinite order, and it follows from (5.1) 
that U contains at least two independent elements, that is, 1<r(G, 0). If 
furthermore r(G, 0) is an odd finite number, then it follows from Corollary
7.2 that V* is not a cyclic group, and consequently that V contains at least 
two independent elements of infinite order. Thus the necessity of the condi- 
tions (iii) and (iv) has been verified. 

If G=F(G), then G=F(G, p)+F’(G, p), where F’(G, p) is the direct sum 
of all the F(G, q) for q ≠ p, and uv = 0 for u in F(G, p) and v in F'(G, p). Thus a
multiplication xy of G in V is a proper multiplication if, and only if, xy in- 
duces a proper multiplication of every F(G, p) in V. 

Suppose now that xy is a proper multiplication of F(G, p) in V and that
the orders of the elements in F(G, p) are bounded. Then (v) is a consequence
of (5.2) and (5.1). Suppose now that F(G, p) = F'+F'', where $p^{i-1}F' = 0$,
whereas F'' is a direct sum of cyclic groups whose orders are multiples
of $p^i$. Then xy induces by (6.2) a proper multiplication of F'' in V. Clearly
M(F'', xy) < F(V, p). If F(V, p) = V'+V'', where $p^{i-1}V' = 0$, V'' contains
exactly one cyclic subgroup of order $p^i$, then it follows from (6.1) that xy
induces a proper multiplication of F'' in F(V, p)/V' ~ V''. If F'' is a finite
group, it follows from Corollary 7.2 that F'' is a direct sum of two isomorphic
groups. This shows the necessity of condition (vi).

Assume now that the direct sum G of cyclic groups and the abelian group
V satisfy the conditions (i) to (vi) of the existence theorem.

Case 1. F(G) ≠ G. If the orders of the elements in F(G) are not bounded,
there exists, by (13.1) and (i), a proper multiplication of G in V. If the orders
of elements in F(G) are bounded, then G = F'+F''+U, where F' is a finite
group, F'' is a direct sum of two isomorphic groups without elements of
infinite order, and U is a direct sum of infinite cyclic groups. If V does not
contain elements of infinite order, then U is, by (iii), a direct sum of an infinity of
infinite cyclic groups, and the orders of the elements in F(V) are not bounded,
by (ii). There exists, by (12.1), a proper multiplication xy of U in V and there
exists, by (11.1), a proper multiplication xy of F'' in V. Consequently there
exists a proper multiplication of F''+U in V. F' has, as a finite group, a
basis of the form $b_1, \dots, b_k$, where the order of $b_{i-1}$ is a divisor of the order
of $b_i$. Denote by $v_i$ an element in V whose order equals the order of $b_i$ and by
$u_0, u_1, \dots$ a basis of U. Then the proper multiplication of F''+U in V is
extended to a proper multiplication of G in V by the equations


= 0, = Uk, b jy = 0, for v 0, 
bf” =0 for f” in 
If the orders of the elements in F(G) are bounded, then G=F’+U’+W, 


where W is a direct sum of two isomorphic groups, F’ a finite group, and U’ 
a direct sum of two or three infinite cyclic groups; and if r(U’, 0) =3, then 
r(W, 0) is finite. If the orders of the elements in F(V) are bounded, then V
contains elements of infinite order, and if r(U’, 0) =3, then r(G, 0) is an odd 
finite number and V contains, by (iv), two independent elements of infinite 
order. Thus there exists, by (13.2), a proper multiplication of F’+U’ in V; 
and by (11.1), (12.2) there exists a proper multiplication of W in V. Con- 
sequently there exists a proper multiplication of G in V. 

Case 2. F(G) = G. Since F(G) is the direct sum of the groups F(G, p), it is
sufficient to construct proper multiplications of F(G, p) in V. If the orders of
the elements in F(G, p) are not bounded, there exists by (11.2) a proper
multiplication of G in V. If the orders of the elements in F(G, p) are bounded,
either F(G, p) is a finite group and the existence of a proper multiplication of
F(G, p) in V is a consequence of (11.3), or F(G, p) is a direct sum of an
infinity of cyclic groups of order $p^i$ and some cyclic groups of lower order
and the existence of a proper multiplication of F(G, p) in V is a consequence
of (11.4), or finally F(G, p) is a direct sum of a group F' and a group F'',
where F' is a direct sum of a finite number of cyclic groups whose orders are
multiples of $p^i$ whereas F'' is a direct sum of an infinity of cyclic groups
of order $p^{i-1}$ and cyclic groups of lower order. In this last case there exists a
proper multiplication of F(G, p) in V, since there exists by (11.3) a proper
multiplication of F' in V and by (11.4) a proper multiplication of F'' in V.

15. Theorem 15.1. We prove the following theorem: 


THEOREM 15.1. Assume that V is an abelian group, and that G is a direct
sum of a finite number of cyclic groups. Then there exists one and, apart from
isomorphic ones, only one proper multiplication of G in V if the following condi-
tions are satisfied:

(A) If r(G, 0) ≠ 0, then 1 < r(G, 0) < 4.

(B) If r(G, 0) = 3, then F(G) = F(V) = 0 and 2 = r(V, 0) (= r(V)).

(C) If r(G, 0) = 2 and F(G, p) ≠ 0, then F(G, p) is a direct sum of an odd
number of isomorphic cyclic groups, and V contains elements of infinite order and 
exactly one cyclic subgroup of order p*, where p* is the order of the cyclic direct 
summands of F(G, p). 

(D) If G = F(G) (that is, if G is a finite group), if F(G, p) ≠ 0 and if
2 < r(V_p), then F(G, p) is a direct sum of two isomorphic cyclic groups.

(E) If G = F(G), F(G, p) ≠ 0, and 2 = r(V_p), then F(G, p) is a direct sum of
two cyclic groups of order $p^n$ and one cyclic group of order $p^m$ with $0 \le m \le n$,
0 < n, and V contains two independent elements of order $p^m$.

(F) If G = F(G), F(G, p) ≠ 0, and 1 = r(V_p), then F(G, p) is a direct sum
of two isomorphic groups. 

(G) If F(G) contains elements of order n, then V contains elements of order n. 
Proof. If the conditions (A) to (G) are satisfied, then it follows from the
existence theorem and its “transformation” that there exists a proper multi- 
plication of G in V. That any two proper multiplications of G in V are iso- 
morphic, if the conditions (A) to (G) are satisfied, is a consequence of the 
following propositions: 

(1) Lemma 8.1 if r(G, 0) =3. 

(2) Corollary 7.4 if r(G, 0) =2. 

(3) Corollary 7.2 if G = F(G) is a direct sum of two isomorphic groups and
r(V_p) = 1 for every prime number p such that elements of order p are con-
tained in G.

(4) Lemma 8.1 if G = F(G) and F(G, p) is a direct sum of at most three
cyclic groups and r(V_p) = 2, provided F(G, p) ≠ 0.

(5) (5.1) and (5.2) if G=F(G) and if F(G, p) is a direct sum of two iso- 
morphic cyclic groups. 

Suppose now that there exists one, and apart from isomorphic multiplica-
tions, only one proper multiplication of G in V. Assume first that F(G) ≠ G.
Then 1 < r(G, 0) since F(G) and r(G, 0) are finite, as follows from condition
(iii) of the existence theorem. Since furthermore G=F(G)+U+U’, where U 
is a direct sum of an even number of infinite cyclic groups and U’ is either 0 
or an infinite cyclic group, and since every proper multiplication of U in 
an infinite cyclic subgroup of V may be extended to a proper multiplication 
of G in V, it follows from (12.2) that r(G, 0) <3. If r(G, 0) =3, then it follows 
from (13.2), (1) and (4), that F(V) =0, and it is a consequence of condition (i) 
of the existence theorem that F(G) =0. Now condition (B) is a consequence 
of (12.3). If r(G, 0) =2, then it follows from (13.2), (1) and (2), that F(G, p) 
is either 0 or a direct sum of an odd number of isomorphic cyclic groups and 
that r(V_p) = 1 is a consequence of (13.2), (3). Thus conditions (A) to (C) are
proved to be necessary. 

Assume secondly that G = F(G) is a finite group. Then the conditions (D), 
(E), (F) are consequences of (11.4) and of the conditions (i), (v), and (vi) 
of the existence theorem which imply also condition (G). 

16. Proof of the uniqueness theorem. If G is a direct sum of a finite num- 
ber of cyclic groups, if V is an abelian group, and if G and V satisfy the con- 
ditions (1) to (8) of the uniqueness theorem, then it is a consequence of 
Theorem 15.1 that any two proper multiplications of G in V are isomorphic 
and that proper multiplications of G in V exist. If G contains elements of in-
finite order, then V is a direct sum of groups of type* $p^\infty$ and of groups of the
type of the additive group of all the rational numbers, and if G=F(G), but 


* Groups of type $p^\infty$ contain only elements whose orders are powers of p, and contain, for every
positive integer i, exactly one cyclic subgroup of order $p^i$.
F(G, p) ≠ 0, then F(V, p) is a direct summand of V which is itself a direct sum
of groups of type $p^\infty$.* Thus the isomorphisms of relevant subgroups of finite
rank of V may be induced by automorphisms of V. Since finally all the func- 
tions P(n, x) appearing in the transformation of the uniqueness theorem sat-
isfy P(n, x) = 0 (because V = nV for every relevant n), it follows from the
transformation of the uniqueness theorem that the conditions (1) to (8) are 
sufficient. 

In order to prove the necessity of the conditions (1) to (8) it is sufficient 
to prove the necessity of the conditions (1) and (2), as may be inferred from 
Theorem 15.1 and the transformation of the uniqueness theorem. V contains 
a subgroup of the form F(V)+U, where U is a direct sum of r(G, 0) infinite 
cyclic groups, and where U may contain any preassigned element of infinite 
order as a basis element. If F(G) ≠ G, then there exist proper multiplications
xy of G in F(V)+U such that a given basis element of U appears in M(G, xy).
If V ≠ nV for some positive n, then it is always possible to find a proper multi-
plication xy of G in V such that all elements of infinite order in M(G, xy) are
elements in nV and to find another one where this is not the case. Thus con-
dition (2) is also necessary.

In order to prove (1) we will have to consider the functions P(n, x). For 
every proper multiplication xy of G in V there exist admissible functions 
P(n, x) which have, on a basis of F(G), preassigned values. If P(n, x) = 0 for
every x of a basis of F(G), then P(n, x) = 0 for every odd n and every x in $G_n$,
and the $P(2^i, x)$ are elements in $M(G, xy)_2$. Thus V = pV for every odd prime
number p such that F(G, p) ≠ 0; and if V contains elements of order 4 and if
F(G, 2) ≠ 0, then V = 2V. By condition (2) and condition (i) of the existence
theorem only the following case has to be discussed in order to complete the 
proof of the necessity of condition (1): F(G, 2) is a direct sum of groups of 
order 2 and so is F(V, 2). 

It follows from (6) to (8) that only the following cases are possible: 


I. r(F(G, 2)) =3 and r(F(V, 2)) =2. 
II. 2 < r(F(G, 2)) = 2i and r(F(V, 2)) = 1.

III. 2=r(F(G, 2)). 

Case I. There exists, by Lemma 8.1, for the given proper multiplication
xy of G in V a basis b, b', b'' of F(G, 2) such that bb' and b'b'' form a basis of
F(V, 2) and such that bb'' = 0. If P(2, b) = P(2, b') = P(2, b'') = 0, then

P(2, b + b') = bb', P(2, b' + b'') = b'b'', P(2, b'' + b) = 0,
P(2, b + b' + b'') = bb' + b'b''.


* Cf. R. Baer, Annals of Mathematics, vol. 37 (1936), pp. 766-781. 
If on the other hand P'(2, b) = P'(2, b') = P'(2, b'') = bb', then
P'(2, b + b') = bb', P'(2, b' + b'') = b'b'', P'(2, b'' + b) = 0,
P'(2, b + b' + b'') = b'b'',
and these two sets of admissible functions are clearly essentially different, 
unless V=2V. 

Case II. There exists, by Corollary 7.2, a basis $b_j'$, $b_j''$ of F(G, 2) such that
$b_1' b_1'' = \cdots = b_k' b_k'' = c$ and such that all the other products of basis ele-
ments are 0. The admissible function P(2, x), characterized by $P(2, b_j')$
= P(2, b{’) =0, has the property that P(2, x) #0 for exactly 2*—! elements x 
in F(G, 2). The admissible functions P’(2, x), on the other hand, which are 
characterized by 


$P'(2, b_1') = P'(2, b_1'') = c, \qquad P'(2, b_j') = P'(2, b_j'') = 0,$
for 1<j, have the property that P’(2, x) #0 for exactly 2*-1(3-2*-!—1) 


(for 3, if k=1) elements x in F(G, 2). This completes the proof of the neces- 
sity of condition (1) since this treatment of Case II covers Case III too. 


THE UNIVERSITY OF ILLINOIS,
URBANA, ILL.


DEFINITELY SELF-ADJOINT BOUNDARY 
VALUE PROBLEMS* 
BY 
GILBERT A. BLISS 

1. Introduction. The boundary value problem to be considered in
this paper is that of finding a constant $\lambda$ and a set of functions $y_i(x)$
($a \le x \le b$; $i = 1, \dots, n$) satisfying differential equations and boundary con-
ditions of the form


(1.1) $y_i' = [A_{i\alpha}(x) + \lambda B_{i\alpha}(x)]\, y_\alpha, \qquad M_{i\alpha} y_\alpha(a) + N_{i\alpha} y_\alpha(b) = 0,$


in which the matrix $\|M_{i\alpha}\ N_{i\alpha}\|$ is a matrix of real constants of rank n. Re-
peated subscripts indicate summation, as in tensor analysis, and it will be
understood that all subscripts have the range $1, \dots, n$ unless otherwise ex-
plicitly specified. The system

(1.2) $z_i' = -z_\alpha [A_{\alpha i} + \lambda B_{\alpha i}], \qquad z_\alpha(a) P_{\alpha i} + z_\alpha(b) Q_{\alpha i} = 0$


is by definition adjoint to (1.1) if the matrix of constants $\|P_{\alpha i}\ Q_{\alpha i}\|$ satisfies
the conditions


$M_{i\alpha} P_{\alpha k} - N_{i\alpha} Q_{\alpha k} = 0 \qquad (i, k = 1, \dots, n).$


The system which is given in (1.1) is said to be self-adjoint provided that it
is equivalent to its adjoint system (1.2) by a non-singular transformation
$z_i = T_{i\alpha}(x)\, y_\alpha$.
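
The following short check is not part of the original text. It assumes the differential equations in (1.1) and (1.2) read $y_i' = [A_{i\alpha} + \lambda B_{i\alpha}]y_\alpha$ and $z_i' = -z_\alpha[A_{\alpha i} + \lambda B_{\alpha i}]$, and it shows why the second system is the natural adjoint of the first: for solutions belonging to the same $\lambda$ the bilinear expression $z_\alpha y_\alpha$ is constant on the interval.

```latex
\begin{align*}
\frac{d}{dx}\,(z_\alpha y_\alpha)
  &= z_\alpha' y_\alpha + z_\alpha y_\alpha' \\
  &= -z_\beta\,[A_{\beta\alpha} + \lambda B_{\beta\alpha}]\,y_\alpha
     + z_\alpha\,[A_{\alpha\beta} + \lambda B_{\alpha\beta}]\,y_\beta \\
  &= 0 \qquad \text{(the two sums differ only in the names of the summation indices).}
\end{align*}
```

Consequently $z_\alpha(b) y_\alpha(b) = z_\alpha(a) y_\alpha(a)$ for such a pair, which is the relation that the boundary conditions of the two systems are designed to exploit.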

This definition of self-adjoint boundary value problems and a further defi- 
nition of so-called definite self-adjointness were given by the author in a paper 
published in 1926† which will be designated in the text below by the Roman
numeral I. In that paper it was stated that the boundary value problems 
arising from the calculus of variations are all definitely self-adjoint. This 
statement is true for non-singular problems of the calculus of variations with- 
out side conditions, the only ones whose boundary value problems had been 
studied up to that time so far as is known to the writer. It is not true, how- 
ever, for problems of the calculus of variations such as those of Mayer, 
Lagrange, and Bolza whose boundary value problems are self-adjoint but not 
definitely self-adjoint according to the definition given in I. One of the earliest 


* Presented to the Society, April 11, 1936; received by the editors November 3, 1937. 
† Bliss, A boundary value problem for a system of ordinary differential equations of the first order,
these Transactions, vol. 28 (1926), pp. 561-584. 
formulations of a case of this more complicated kind was that of Cope for the
problem of Mayer with variable end points.* 

In the following pages a modification of the earlier definition of definite 
self-adjointness will be given which seems to be applicable to all of the bound- 
ary value problems so far studied arising from problems of the calculus of 
variations involving simple integrals. The new definition involves a property 
analogous to the normality of a minimizing arc for a problem of Bolza, and 
is weaker than the older definition in the sense that it imposes fewer restric- 
tions. It will be shown, however, that for a definitely self-adjoint boundary 
value problem as here defined most of the properties deduced in the paper I 
cited above are still valid. For example, the characteristic numbers are all 
real and have indices equal to their multiplicities, and the expansion theorems 
proved in the paper I also hold. It is not possible to show that the number of 
characteristic numbers is always infinite. Examples will be cited showing that 
this is in fact not the case. When the set of characteristic numbers is finite the 
class of functions for which the expansion theorems hold is of course severely 
limited. The boundary value problems arising from the calculus of variations 
are a special type of definitely self-adjoint problems which have an infinity 
of characteristic numbers, as has been shown by several writers. 

In the paragraphs below frequent use is made of the results and proofs of 
the paper I to which reference has been made above. 

2. The definition of definite self-adjointness and its first consequences. 
It is understood that $A_{i\alpha}(x)$, $B_{i\alpha}(x)$ are real, single-valued and continuous on
$a \le x \le b$. The definition fundamental for this paper is then the following:


DEFINITION. A boundary value problem (1.1) is said to be definitely self-
adjoint if it is self-adjoint and has the further properties:

(1) the matrix of functions $S_{ik}(x) = T_{\alpha i}(x) B_{\alpha k}(x)$ is symmetric at each
value x on the interval ab;

(2) the quadratic form $S_{\alpha\beta}(x)\xi_\alpha\xi_\beta$ is non-negative at each value x on ab;

(3) the set $y_i(x) \equiv 0$ is the only set of functions which satisfies on ab the
conditions


(2.1) $y_i' = A_{i\alpha} y_\alpha, \qquad M_{i\alpha} y_\alpha(a) + N_{i\alpha} y_\alpha(b) = 0, \qquad S_{\alpha\beta} y_\alpha y_\beta = 0.$


* Cope, An analogue of Jacobi’s condition for the problem of Mayer with variable end-points, 
Dissertation, University of Chicago, 1927. For a synopsis see Abstracts of Theses, The University of 
Chicago, vol. 6 (1927-1928), pp. 15-21. See also American Journal of Mathematics, vol. 59 (1937), 
pp. 655-672. 

† Hu, The problem of Bolza and its accessory boundary value problem, Contributions to the Calculus
of Variations 1931-1932, The University of Chicago Press, p. 400; Morse, Sufficient conditions in the 
problem of Lagrange with variable end conditions, American Journal of Mathematics, vol. 53 (1931), 
pp. 517-546, especially §16. 
The property (3) is analogous to normality in the calculus of variations,
as will be shown in a later section. Since the quadratic form with matrix $S_{ik}$ is
non-negative, it follows that every set of values $y_i$ which satisfy the equation
$S_{\alpha\beta} y_\alpha y_\beta = 0$ must also satisfy $S_{i\alpha} y_\alpha = 0$ and consequently the equations
$B_{i\alpha} y_\alpha = 0$, since the determinant $|T_{ik}|$ is different from zero. In the conditions
(2.1) we can therefore replace the last equation by $B_{i\alpha} y_\alpha = 0$ if desirable.
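
The step from the vanishing of the quadratic form to the linear equations is stated without proof; a brief sketch of the standard argument follows. It is an editorial addition and assumes only properties (1) and (2), with $S_{ik} = T_{\alpha i}B_{\alpha k}$; the auxiliary values $\eta_i$ and the real parameter $t$ are introduced here for the purpose.

```latex
% Suppose S_{\alpha\beta} y_\alpha y_\beta = 0 at some x. For arbitrary real
% \eta_i and real t, non-negativity and symmetry of S give
\begin{align*}
0 \;\le\; S_{\alpha\beta}(y_\alpha + t\eta_\alpha)(y_\beta + t\eta_\beta)
  &= S_{\alpha\beta}y_\alpha y_\beta
     + 2t\,S_{\alpha\beta}y_\alpha\eta_\beta
     + t^2 S_{\alpha\beta}\eta_\alpha\eta_\beta \\
  &= 2t\,S_{\alpha\beta}y_\alpha\eta_\beta + t^2 S_{\alpha\beta}\eta_\alpha\eta_\beta .
\end{align*}
% A polynomial 2ct + dt^2 that is non-negative for all real t must have c = 0, so
\[
S_{\alpha\beta}\,y_\alpha\,\eta_\beta = 0 \ \text{ for every } \eta
\quad\Longrightarrow\quad
S_{i\alpha}\,y_\alpha = T_{\beta i}\,\bigl(B_{\beta\alpha}\,y_\alpha\bigr) = 0
\quad\Longrightarrow\quad
B_{i\alpha}\,y_\alpha = 0 ,
\]
% the last implication because the determinant |T_{ik}| is different from zero.
```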

Let $Y_{ik}(x, \lambda)$ be the elements of a matrix whose columns are n linearly
independent solutions of the differential equations in (1.1), and let $s_i(y)$ repre-
sent the first member of the second equation (1.1). Then the characteristic
numbers of the boundary value problem are the roots of the determinant
given by


$D(\lambda) = |\, s_i[Y_k(x, \lambda)] \,|$


in which the symbol $s_i[Y_k(x, \lambda)]$ represents the value of $s_i(y)$ formed for the
kth column of the matrix of elements $Y_{ik}$. The index of a root $\lambda_0$ of $D(\lambda)$ is by
definition the number r when n-r is the rank of $D(\lambda_0)$, and the multiplicity
of $\lambda_0$ is its multiplicity as a root of $D(\lambda)$.
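
As an illustration that is not taken from the paper, consider the case $n = 2$ on the interval $0 \le x \le \pi$ with $y_1' = y_2$, $y_2' = -\lambda y_1$ and boundary conditions $y_1(0) = 0$, $y_1(\pi) = 0$. This has the form (1.1) with $A_{12} = 1$, $B_{21} = -1$, $M_{11} = 1$, $N_{21} = 1$ and all other entries zero, and the matrix of boundary constants has rank 2. The choice of example and of the solutions normalized by $Y(0, \lambda) = $ identity are assumptions made here only to show how $D(\lambda)$, index and multiplicity are computed.

```latex
\[
Y(x,\lambda) =
\begin{pmatrix}
\cos(\sqrt{\lambda}\,x) & \dfrac{\sin(\sqrt{\lambda}\,x)}{\sqrt{\lambda}} \\[2mm]
-\sqrt{\lambda}\,\sin(\sqrt{\lambda}\,x) & \cos(\sqrt{\lambda}\,x)
\end{pmatrix},
\qquad
s_1(y) = y_1(0), \quad s_2(y) = y_1(\pi),
\]
\[
D(\lambda) =
\begin{vmatrix}
1 & 0 \\[1mm]
\cos(\sqrt{\lambda}\,\pi) & \dfrac{\sin(\sqrt{\lambda}\,\pi)}{\sqrt{\lambda}}
\end{vmatrix}
= \frac{\sin(\sqrt{\lambda}\,\pi)}{\sqrt{\lambda}} .
\]
% D is a power series in \lambda with D(0) = \pi. Its roots are \lambda = k^2
% (k = 1, 2, ...), each a simple root. At such a root the matrix under the
% determinant has rows (1, 0) and (\pm 1, 0), hence rank 1 = n - r with r = 1,
% so index and multiplicity agree.
```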


THEOREM 2.1. For a definitely self-adjoint boundary value problem every
root of the determinant $D(\lambda)$ is real, and the independent characteristic solutions
of the boundary value problem corresponding to such a root may be chosen real.


For suppose that $y_i = y_{i1} + (-1)^{1/2} y_{i2}$ were a solution of the boundary
value problem, not identically zero and corresponding to an imaginary root
$\lambda = \lambda_1 + (-1)^{1/2}\lambda_2$ of $D(\lambda)$. Then the conjugate imaginary set $\bar y_i = y_{i1} - (-1)^{1/2} y_{i2}$
would be a solution corresponding to the root $\bar\lambda = \lambda_1 - (-1)^{1/2}\lambda_2$. According
to I, Theorem 8, we would have


$\int_a^b S_{\alpha\beta}\, y_\alpha \bar y_\beta\, dx = \int_a^b \left( S_{\alpha\beta}\, y_{\alpha 1} y_{\beta 1} + S_{\alpha\beta}\, y_{\alpha 2} y_{\beta 2} \right) dx = 0.$


This would imply a contradiction since by a remark made above the equa-
tions $B_{i\alpha} y_{\alpha 1} = B_{i\alpha} y_{\alpha 2} = 0$ would be consequences of the last equation, and one
verifies readily by substitution in (1.1) that the functions $y_{i1}(x)$ and $y_{i2}(x)$
would satisfy the equations (2.1) and hence be identically zero.
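
The two computations left to the reader in this proof can be written out as follows. This is an editorial sketch: it assumes that the relation quoted from I, Theorem 8 is the integrated orthogonality relation for solutions belonging to distinct characteristic numbers, and it writes $y_{i1}$, $y_{i2}$ for the real and imaginary parts of the solution.

```latex
% (a) Cross terms cancel because S_{\alpha\beta} is symmetric:
\begin{align*}
S_{\alpha\beta}\, y_\alpha \bar y_\beta
  &= S_{\alpha\beta}\bigl(y_{\alpha 1} + (-1)^{1/2} y_{\alpha 2}\bigr)
                    \bigl(y_{\beta 1} - (-1)^{1/2} y_{\beta 2}\bigr) \\
  &= S_{\alpha\beta} y_{\alpha 1} y_{\beta 1} + S_{\alpha\beta} y_{\alpha 2} y_{\beta 2}
     + (-1)^{1/2}\bigl(S_{\alpha\beta} y_{\alpha 2} y_{\beta 1}
                       - S_{\alpha\beta} y_{\alpha 1} y_{\beta 2}\bigr) \\
  &= S_{\alpha\beta} y_{\alpha 1} y_{\beta 1} + S_{\alpha\beta} y_{\alpha 2} y_{\beta 2} .
\end{align*}
% Both terms are non-negative and continuous, so a vanishing integral forces each
% to vanish identically, whence B_{i\alpha} y_{\alpha 1} = B_{i\alpha} y_{\alpha 2} = 0.
%
% (b) Substitution in (1.1), using these identities and the reality of A, M, N:
\begin{align*}
y_{i1}' + (-1)^{1/2} y_{i2}'
  &= A_{i\alpha}\bigl(y_{\alpha 1} + (-1)^{1/2} y_{\alpha 2}\bigr)
  &&\Longrightarrow\quad y_{i1}' = A_{i\alpha} y_{\alpha 1}, \quad
                         y_{i2}' = A_{i\alpha} y_{\alpha 2}, \\
s_i(y) &= s_i(y_1) + (-1)^{1/2} s_i(y_2) = 0
  &&\Longrightarrow\quad s_i(y_1) = s_i(y_2) = 0 .
\end{align*}
```

Thus each of the real sets $y_{i1}$, $y_{i2}$ satisfies all three conditions (2.1), and property (3) makes both vanish identically.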


THEOREM 2.2. For a definitely self-adjoint boundary value problem the index
of every root of $D(\lambda)$ is equal to its multiplicity.

The proof is identical with that of I, Theorem 10, down to the last equa-
tion on page 572 which would again imply $B_{i\alpha} y_\alpha = 0$ and $y_i \equiv 0$, as in the
paragraph above preceding Theorem 2.2, and this would be a contradiction
since the functions $y_i$ in the proof are not identically zero.

For the new definition of definite self-adjointness the Theorem 11 of the 
paper I will be replaced by the following theorem which is analogous to a 
theorem of Hu* for boundary value problems of the calculus of variations:


THEOREM 2.3. If for a set of functions $f_i(x)$ continuous on the interval ab the
condition


(2.2) $\int_a^b S_{\alpha\beta}\, y_\alpha f_\beta\, dx = 0$


is satisfied by every solution $y_i(x)$ of a definitely self-adjoint boundary value prob-
lem, then it is also satisfied by every set of functions $y_i(x)$ satisfying the following
equations


(2.3) $y_i' = A_{i\alpha} y_\alpha + B_{i\alpha} g_\alpha, \qquad s_i(y) = 0$
with functions $g_i(x)$ continuous on the interval ab.


The condition (2.2) for all solutions $y_i(x)$ of the boundary value problem
implies, as in the proof of I, Theorem 11, that the non-homogeneous system


$y_i' = (A_{i\alpha} + \lambda B_{i\alpha})\, y_\alpha + B_{i\alpha} f_\alpha, \qquad s_i(y) = 0$
has a solution $y_i(x, \lambda)$ expressible by power series

$y_i(x, \lambda) = u_{i0}(x) + u_{i1}(x)\lambda + u_{i2}(x)\lambda^2 + \cdots$
whose coefficients $u_{i\mu}(x)$ ($\mu = 0, 1, 2, \dots$) have continuous derivatives on the
interval ab, and which converge uniformly for values x, $\lambda$ satisfying conditions
of the form $a \le x \le b$, $|\lambda| \le \rho$. From the proof of I, Theorem 11, it follows that
(2.4) $u_{i0}' = A_{i\alpha} u_{\alpha 0} + B_{i\alpha} f_\alpha, \qquad s_i(u_0) = 0,$


and also that

$W_0 = \int_a^b S_{\alpha\beta}\, u_{\alpha 0} u_{\beta 0}\, dx = 0.$

From the last equation and the properties (1)-(3) we deduce the identities
$B_{i\alpha} u_{\alpha 0} = 0$. Consider now a set of functions $y_i(x)$ which satisfy equations of
the form (2.3). From (2.3) and the equations (19), (20) of I it follows that
the corresponding functions $z_i = T_{i\alpha} y_\alpha$ satisfy the equations


(2.5) $z_i' = -z_\alpha A_{\alpha i} - B_{\alpha i} T_{\alpha\beta}\, g_\beta, \qquad t_i(z) = 0,$
where $t_i(z)$ is a symbol for the first member of the second equation (1.2). From
(2.4) and (2.5) and the identities $B_{i\alpha} u_{\alpha 0} = 0$, we have

$z_\alpha' u_{\alpha 0} + z_\alpha u_{\alpha 0}' = S_{\alpha\beta}\, y_\alpha f_\beta;$


* Hu, loc. cit., Theorem 7.3, p. 396. 
and hence with the help of equation (7) of I we find that every set of functions
$y_i$ which satisfy the equations (2.3) will also satisfy (2.2), as was to be
demonstrated.
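
The final identity of the proof can be checked term by term. The sketch below is an editorial addition and rests on stated assumptions: (2.4) is read as $u_{i0}' = A_{i\alpha}u_{\alpha 0} + B_{i\alpha}f_\alpha$, (2.5) as $z_i' = -z_\alpha A_{\alpha i} - B_{\alpha i}T_{\alpha\beta}g_\beta$, and equation (7) of I, which is not reproduced here, is taken to be the relation that makes the boundary term vanish when $s_i(u_0) = 0$ and $t_i(z) = 0$.

```latex
\begin{align*}
z_\alpha' u_{\alpha 0} + z_\alpha u_{\alpha 0}'
  &= \bigl(-z_\beta A_{\beta\alpha} - B_{\beta\alpha} T_{\beta\gamma}\, g_\gamma\bigr) u_{\alpha 0}
     + z_\alpha \bigl(A_{\alpha\beta} u_{\beta 0} + B_{\alpha\beta} f_\beta\bigr) \\
  &= -\,T_{\beta\gamma}\, g_\gamma\, \bigl(B_{\beta\alpha} u_{\alpha 0}\bigr)
     + z_\alpha B_{\alpha\beta} f_\beta
     && \text{(the terms in } A \text{ cancel)} \\
  &= T_{\alpha\gamma}\, y_\gamma\, B_{\alpha\beta} f_\beta
     && (B_{\beta\alpha} u_{\alpha 0} = 0,\ z_\alpha = T_{\alpha\gamma} y_\gamma) \\
  &= S_{\gamma\beta}\, y_\gamma f_\beta
     && (S_{ik} = T_{\alpha i} B_{\alpha k}) .
\end{align*}
% Integrating over the interval ab:
\[
\int_a^b S_{\alpha\beta}\, y_\alpha f_\beta \, dx
  = \bigl[\, z_\alpha u_{\alpha 0} \,\bigr]_a^b ,
\]
% and the right member is zero under the assumption on equation (7) of I.
```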


COROLLARY 2.1. If the determinant $|B_{ik}(x)|$ is different from zero on the
interval ab, then $f_i \equiv 0$ is the only set of functions which satisfy the condition (2.2)
with every solution $y_i(x)$ of a definitely self-adjoint boundary value problem.


This follows from the equations (2.4) and the identities $u_{i0} \equiv 0$ which are
consequences of the identities $B_{i\alpha} u_{\alpha 0} = 0$ when the determinant $|B_{ik}|$ is no-
where zero.


COROLLARY 2.2. If the functions $f_i$ satisfy equations of the form
$f_i' = A_{i\alpha} f_\alpha + B_{i\alpha} g_\alpha, \qquad s_i(f) = 0$


with functions $g_i(x)$ continuous on the interval ab, and if they satisfy the condi-
tion (2.2) with every solution of a definitely self-adjoint boundary value problem,
then they also satisfy the identities $B_{i\alpha} f_\alpha = 0$.


This follows readily from Theorem 2.3 when we note that the functions
$y_i = f_i$ satisfy equations of the form (2.3), and therefore from (2.2) that


$\int_a^b S_{\alpha\beta}\, f_\alpha f_\beta\, dx = 0.$


By reasoning similar to that used a number of times above it follows then
that $B_{i\alpha} f_\alpha = 0$.

3. The expansion theorems. Since the roots of the power series $D(\lambda)$ form
a finite or infinite denumerable set, and since the number of linearly inde-
pendent solutions $y_i(x)$ of the boundary value problem associated with each
root is equal to the multiplicity of the root, it follows that the solutions and
their corresponding characteristic numbers can be enumerated and denoted
by symbols $y_{i\nu}(x)$, $\lambda_\nu$ ($\nu = 1, 2, \dots$). Furthermore these solutions can be
normed and orthogonalized by well known processes so that


(3.1) $\int_a^b S_{\alpha\beta}\, y_{\alpha\mu} y_{\beta\nu}\, dx = \delta_{\mu\nu}$


where $\delta_{\mu\mu} = 1$, $\delta_{\mu\nu} = 0$ if $\mu \neq \nu$. For an arbitrary set of functions $f_i(x)$ continuous
on the interval ab the constants $c_\nu$ may be defined by the equations


(3.2) $c_\nu = \int_a^b S_{\alpha\beta}\, f_\alpha y_{\beta\nu}\, dx.$

The fundamental Theorem 13 of the paper I needs a different proof with the
new definition of definite self-adjointness. It may be written in the following
form: 


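THEOREM 3.1. For every set of functions $f_i(x)$ satisfying equations of the form


(3.3) $f_i' = A_{i\alpha} f_\alpha + B_{i\alpha} g_\alpha, \qquad s_i(f) = 0$
with functions $g_i(x)$ continuous on the interval ab, the series

(3.4) $\varphi_i(x) = \sum_\nu c_\nu\, y_{i\nu}(x)$

converge uniformly on the interval ab, and $B_{i\alpha}(f_\alpha - \varphi_\alpha) = 0$.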


The sums $\varphi_i$ may contain only a finite number of terms if the set of char-
acteristic numbers $\lambda_\nu$ is finite. But the uniform convergence of these series can
in every case be proved as in I, §6. To prove the identities $B_{i\alpha}(f_\alpha - \varphi_\alpha) = 0$
we note first that for every $\nu$


(3.5) $\int_a^b S_{\alpha\beta}(f_\alpha - \varphi_\alpha)\, y_{\beta\nu}\, dx = 0$


because of the equations (3.1) and (3.2). From Theorem 2.3 it follows there- 
fore that 


b 
(3.6) f — bafsdx = 0 


since the functions f; by hypothesis satisfy the equations (3.3) which are of 
the form (2.3). Furthermore 


b 
(3.7) f ~ 


because of the form of the functions (3.4) and the relations (3.5). By sub- 
tracting (3.7) from (3.6) we find that 


f Sus(fa — ba) (fo — s)dx = 0; 


hence, by the usual argument, we obtain the desired identities. 


COROLLARY 3.1. If the determinant $|B_{i\alpha}(x)|$ is nowhere zero on the interval ab, then for every set of functions $f_i(x)$ having continuous derivatives on that interval and satisfying the boundary conditions $s_i(f) = 0$ the sums (3.4) converge uniformly and are equal to the functions $f_i$ on the interval ab.


The corollary is identical with Corollary 1 of paper I and is proved in the same way.

COROLLARY 3.2. If the functions $f_i(x)$ satisfy equations of the form (3.3), and if furthermore the functions $g_i(x)$ in those equations are solutions of a similar system

$g_i' = A_{i\alpha} g_\alpha + B_{i\alpha} h_\alpha, \qquad s_i(g) = 0$

with functions $h_i(x)$ continuous on the interval ab, then the series (3.4) converge uniformly and are equal to the functions $f_i(x)$.


This is Corollary 2 of I, page 576, but its proof needs emendation for the new definition of definite self-adjointness. We use the notations

$d_\nu = \int_a^b S_{\alpha\beta}\, g_\alpha\, y_{\beta\nu}\, dx, \qquad \psi_i = \sum_\nu d_\nu\, y_{i\nu}.$

The equation

(3.8) $\phi_i' = A_{i\alpha}\phi_\alpha + B_{i\alpha}\psi_\alpha$

is equation (39) of I and is proved in the same way. From (3.3), (3.8), and the fact that $s_i(y_\nu) = 0$ it follows that

(3.9) $f_i' - \phi_i' = A_{i\alpha}(f_\alpha - \phi_\alpha) + B_{i\alpha}(g_\alpha - \psi_\alpha), \qquad s_i(f - \phi) = 0.$

The last term in the first equation (3.9) vanishes identically since the equations $B_{i\alpha}(g_\alpha - \psi_\alpha) = 0$ are consequences of Theorem 3.1 applied to the functions $g_i$ in place of the $f_i$. The similar identities $B_{i\alpha}(f_\alpha - \phi_\alpha) = 0$ for the functions $f_i$, from Theorem 3.1, imply that $(f_\alpha - \phi_\alpha) S_{\alpha\beta} (f_\beta - \phi_\beta) = 0$ and hence from equations (3.9) and the property (3) in the definition of definite self-adjointness that $f_i - \phi_i = 0$.

4. The boundary value problem associated with a problem of Bolza. The second variation of the problem of Bolza may be taken in the form

$J_2(\xi, \eta) = 2\gamma[\xi_1, \xi_2, \eta(x_1), \eta(x_2)] + \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\, dx$

in which $2\gamma$ is a homogeneous quadratic form in its $2n+2$ arguments $\xi_1$, $\xi_2$, $\eta_i(x_1)$, $\eta_i(x_2)$ $(i = 1, \cdots, n)$, and $2\omega$ is a homogeneous quadratic form in the $2n$ variables $\eta_i(x)$, $\eta_i'(x)$ with coefficients functions of $x$.

An accessory minimum problem associated with this second variation is that of finding in a class of sets $\xi_1$, $\xi_2$, $\eta_i(x)$, satisfying conditions of the form

$\Phi_\beta(x, \eta, \eta') = 0 \qquad (\beta = 1, \cdots, m < n),$
$\Psi_\mu[\xi_1, \xi_2, \eta(x_1), \eta(x_2)] = 0 \qquad (\mu = 1, \cdots, p \leq 2n+2),$
$\int_{x_1}^{x_2} \eta_i \eta_i\, dx = 1,$

one which minimizes $J_2(\xi, \eta)$. The functions $\Phi_\beta$ are homogeneous and linear in the $2n$ variables $\eta_i$, $\eta_i'$ with coefficients functions of $x$, and the functions $\Psi_\mu$ are $p$ homogeneous linear independent expressions with constant coefficients in their $2n+2$ arguments.

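The differential equations and end conditions for a minimizing set $\xi_1$, $\xi_2$, $\eta_i(x)$, for the accessory problem may be expressed with the help of the notations

$\Omega = \omega + \mu_\beta \Phi_\beta, \qquad \zeta_i = \Omega_{\eta_i'}$

as follows:*

(4.1) $d\Omega_{\eta_i'}/dx = \Omega_{\eta_i} - \lambda\eta_i, \qquad \Phi_\beta = 0,$

(4.2)
$\gamma_1 + e_\mu \Psi_{\mu 1} = 0,$
$-\zeta_{i1} + \gamma_{i1} + e_\mu \Psi_{\mu i1} = 0,$
$\gamma_2 + e_\mu \Psi_{\mu 2} = 0,$
$\zeta_{i2} + \gamma_{i2} + e_\mu \Psi_{\mu i2} = 0,$
$\Psi_\mu = 0.$

The notations $\eta_{is}$, $\zeta_{is}$ $(s = 1, 2)$ represent the values $\eta_i(x_s)$, $\zeta_i(x_s)$ $(s = 1, 2)$. The subscripts 1, 2, $i1$, $i2$ attached to $\gamma$ and $\Psi_\mu$ indicate partial derivatives with respect to $\xi_1$, $\xi_2$, $\eta_{i1}$, $\eta_{i2}$, respectively. It is to be shown that the equations (4.1) and (4.2) are equivalent to a boundary value problem of the type studied in the preceding sections.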

The accessory minimum problem is said to be non-singular if the determinant of coefficients of the variables $\eta_i'$, $\mu_\beta$ in the first members of the equations

(4.3) $\Omega_{\eta_i'}(x, \eta, \eta', \mu) = \zeta_i, \qquad \Phi_\beta(x, \eta, \eta') = 0$

is everywhere different from zero on the interval $x_1x_2$. It is said to satisfy the non-tangency condition if the equations $\Psi_\mu = 0$ have no non-vanishing solution $\xi_1$, $\xi_2$, $\eta_i(x_1)$, $\eta_i(x_2)$ with $\eta_i(x_1) = \eta_i(x_2) = 0$, or, in other words, if the matrix of coefficients of $\xi_1$ and $\xi_2$ in the functions $\Psi_\mu$ is of rank 2. Finally the accessory problem is said to be normal if the only solution $\xi_1$, $\xi_2$, $\eta_i(x)$, $\mu_\beta(x)$, $e_\mu$ of equations (4.1) and (4.2) with $\eta_i(x) \equiv 0$ on the interval $x_1x_2$ is the one whose elements all vanish identically.†


* Bliss, The problem of Bolza in the calculus of variations, mimeographed lecture notes, The University of Chicago, 1935, p. 73, equations (14.1), and p. 76, equation (14.8).

† These definitions are customary ones. See, for example, Bliss, The problem of Bolza in the calculus of variations, loc. cit., pp. 34, 82, and §§9, 10. In the two sections cited the notion of normality is analyzed in considerable detail.

The differential equations (4.1) can be expressed in terms of the so-called canonical variables $x$, $\eta_i$, $\zeta_i$ related to the variables $x$, $\eta_i$, $\eta_i'$, $\mu_\beta$ by means of the equations (4.3). If the accessory minimum problem is non-singular, these equations have solutions for $\eta_i'$, $\mu_\beta$ which are linear in the variables $\eta_i$, $\zeta_i$. In terms of the homogeneous quadratic form in $\eta_i$, $\zeta_i$ defined by the equation

$2\mathcal{H} = 2\zeta_i\eta_i' - 2\Omega(x, \eta, \eta', \mu) = U_{ij}(x)\eta_i\eta_j + 2V_{ij}(x)\eta_i\zeta_j + W_{ij}(x)\zeta_i\zeta_j$

the differential equations (4.1) take the well known canonical form

(4.4)
$d\eta_i/dx = \mathcal{H}_{\zeta_i} = V_{ki}\eta_k + W_{ik}\zeta_k,$
$d\zeta_i/dx = -\mathcal{H}_{\eta_i} - \lambda\eta_i = -U_{ik}\eta_k - V_{ik}\zeta_k - \lambda\eta_i.$

The matrices of elements $U_{ij}$ and $W_{ij}$ are, of course, symmetric.

The end conditions (4.2) can also be transformed into a more convenient form. Let

(4.5) $\xi_1,\ \xi_2,\ \eta_{i1},\ \eta_{i2},\ \zeta_{i1},\ \zeta_{i2},\ e_\mu,$
(4.6) $a_1,\ a_2,\ a_{i1},\ a_{i2},\ b_{i1},\ b_{i2},\ d_\mu$

be two sets satisfying those conditions. If we multiply the first four equations in (4.2), respectively, by $a_1$, $a_{i1}$, $a_2$, $a_{i2}$ and add, and then subtract the similar sum with the two solutions interchanged, it follows from the fifth equation (4.2) and well known properties of quadratic forms, that

(4.7) $b_{i1}\eta_{i1} - a_{i1}\zeta_{i1} - b_{i2}\eta_{i2} + a_{i2}\zeta_{i2} = 0.$


Consider now 2n linearly independent solutions of equations (4.2) 


(4.8) i, kl i, k2; Sima, Fine, 
for $i = 1, \cdots, n$. If the accessory minimum problem satisfies the non-tangency condition, the elements $\eta$, $\zeta$ in the sets (4.8) form a $2n \times 4n$-dimensional matrix which is of rank $2n$. Otherwise there would be a solution (4.5) of equations (4.2) with elements $\eta$, $\zeta$ all zero, formed by taking a linear combination of the $2n$ solutions (4.8) with constant coefficients not all zero. The elements $\xi_1$, $\xi_2$ of this solution would also vanish, on account of the fifth of equations (4.2) and the non-tangency condition. The elements $e_\mu$ would then also vanish because of the first four of equations (4.2) and the independence of the functions $\Psi_\mu$. This would contradict the independence of the $2n$ solutions (4.8). It is evident then that the $2n$ equations


— — + = O, 


(4.9) 


related to the set (4.8) as (4.7) is to (4.6), are linearly independent. They are linear combinations of the equations (4.2) and are equivalent to this latter system in the sense that with every set of values $\eta_{i1}$, $\eta_{i2}$, $\zeta_{i1}$, $\zeta_{i2}$ satisfying equations (4.9) there is associated a unique solution of equations (4.2) whose other elements $\xi_1$, $\xi_2$, $e_\mu$ are determined by the first, third, and fifth of equations (4.2). The equations (4.4) and (4.9) define a boundary value problem for the $2n$ functions $\eta_i(x)$, $\zeta_i(x)$ analogous to that characterized by the equations (1.1) for the functions $y_i(x)$.


THEOREM 4.1. For a problem of Bolza having a non-singular normal acces- 
sory minimum problem satisfying the non-tangency condition the boundary value 
problem defined by equations (4.4) and (4.9) is definitely self-adjoint according 
to the definition in §2 above. 

To prove this theorem we note first that necessary and sufficient conditions for the system (1.1) to be self-adjoint, taken from equations (19) and (20) of the paper I, can be expressed by use of matrix notation in the form

(4.10) $TA + \bar{A}T + T' = 0, \qquad TB + \bar{B}T = 0, \qquad M T^{-1}(a) \bar{M} = N T^{-1}(b) \bar{N},$

where the bars indicate transposed matrices and $T'$ is the matrix of derivatives of the elements of $T$. For the boundary value problem defined by equations (4.4) and (4.9) the matrices involved are the $2n$-dimensional matrices


Vix Wij 0 
A = B = 
These satisfy the equations (4.10) with the special transformation matrix 


(1.0) 
T= 
— 6% O 


In proving the first equation (4.10) use is made of the symmetry of the matrices $U$ and $W$, and in proving the third equation relation (4.7) for the various pairs of the solutions (4.8) is needed. The matrix $S = TB$ of §2 above is

$S = \begin{pmatrix} \delta_{ik} & 0 \\ 0 & 0 \end{pmatrix}.$

Evidently this matrix is symmetric and its quadratic form is non-negative. The only functions $\eta_i(x)$, $\zeta_i(x)$ which make this quadratic form vanish identically have the form $\eta_i(x) \equiv 0$, $\zeta_i(x)$, and no set of functions of this type can satisfy equations (4.4) with $\lambda = 0$ and the end conditions (4.9). Otherwise there would be a related solution $\xi_1$, $\xi_2$, $\eta_i(x)$, $\mu_\beta(x)$, $e_\mu$ of the equations (4.1) and (4.2) with $\eta_i(x) \equiv 0$ on the interval $x_1x_2$, which is impossible when the accessory minimum problem is normal. Thus all of the conditions (1), (2), (3) of the definition of definite self-adjointness in §2 are satisfied by the boundary value problem associated with equations (4.4) and (4.9), as stated in Theorem 4.1.

The assumption of the non-tangency condition can be omitted, as has recently been suggested to me by W. T. Reid, if the formulation of the accessory minimum problem is slightly modified. The constants $\xi_1$ and $\xi_2$ in this problem can be replaced by the values $\eta_{n+1}(x_1)$, $\eta_{n+2}(x_2)$ of two functions $\eta_{n+1}(x)$, $\eta_{n+2}(x)$ subjected to differential equations

$\eta_{n+1}' = 0, \qquad \eta_{n+2}' = 0$

which are to be adjoined to the equations $\Phi_\beta = 0$. In the norming integral in the second paragraph of this section the integrand is to be replaced by the sum of the squares of all of the variables $\eta_\sigma(x)$ $(\sigma = 1, \cdots, n+2)$. One verifies readily then that for the new problem the end conditions contain only equations of the form of the second, fourth, and last of the equations (4.2) and the construction of the end conditions (4.9) does not involve the non-tangency condition.

5. Transformations and examples. If a definitely self-adjoint boundary value problem of the form (1.1) for a set of functions $y_i(x)$ is transformed into one for functions $u_i(x)$ by a non-singular transformation $y_i = U_{i\alpha}(x) u_\alpha$, the property of definite self-adjointness will be preserved. This can be verified by means of the following useful and easily derived transformation formulas, in which the subscript 1 designates the matrices associated with the transformed problem:

(5.1)
$A_1 = U^{-1}AU - U^{-1}U', \qquad B_1 = U^{-1}BU,$
$M_1 = MU(a), \qquad N_1 = NU(b),$
$P_1 = U^{-1}(a)P, \qquad Q_1 = U^{-1}(b)Q,$
$T_1 = \bar{U}TU, \qquad S_1 = \bar{U}SU.$

It is understood that in these formulas a bar indicates a transposed matrix and a prime a matrix of derivatives. With the help of these relations one can readily deduce normal forms for definitely self-adjoint boundary value problems when the equations involve only two functions $y_1$ and $y_2$ and the rank of the matrix $B(x)$, and consequently of $S(x)$, is constant on the interval ab.*
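
A small numerical check may help in reading the formulas (5.1). The sketch below assumes constant $2 \times 2$ matrices with made-up entries (so that $U' = 0$) and verifies that the relation $S = TB$ and the skew-symmetry of $T$ survive the transformation; the helper names are hypothetical and nothing here is taken from the paper beyond the formulas themselves.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

# Hypothetical data: T skew-symmetric, S = T B symmetric and positive definite.
T = [[0.0, 2.0], [-2.0, 0.0]]
B = [[0.0, -1.5], [0.5, 0.0]]
S = matmul(T, B)                      # equals [[1, 0], [0, 3]]

U = [[1.0, 0.5], [2.0, 3.0]]          # any non-singular constant matrix
U_bar, U_inv = transpose(U), inverse(U)

# Formulas (5.1) for the transformed problem.
B1 = matmul(matmul(U_inv, B), U)
T1 = matmul(matmul(U_bar, T), U)
S1 = matmul(matmul(U_bar, S), U)
```

With these numbers `S1` agrees with `matmul(T1, B1)`, and `T1` is again skew-symmetric, which is the structure used in the normal forms that follow.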

Consider first the case when the determinant of $B(x)$ is everywhere different from zero on the interval ab and the matrix $S(x)$ therefore positive definite. From the second equation (4.10) and the symmetry of $S = TB$ it follows that

$0 = TB + \bar{B}T = (T + \bar{T})B, \qquad 0 = T + \bar{T},$

so that $T$ is skew-symmetric. There exists a transformation $U$ taking $S$ into the identity matrix, and we have then

$T = \begin{pmatrix} 0 & t \\ -t & 0 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -1/t \\ 1/t & 0 \end{pmatrix}$

with the help of the relation $S = TB$. The transformation

$U = \begin{pmatrix} 1 & 0 \\ 0 & 1/t \end{pmatrix}$

now transforms these matrices into the forms shown in (5.3) below, with $s = 1/t$. One can readily verify the fact that the most general transformation leaving $T$ and $S$ in (5.3) invariant has the form

(5.2) $U = \begin{pmatrix} u_{11} & -s^2 u_{21} \\ u_{21} & u_{11} \end{pmatrix}, \qquad u_{11}^2 + s^2 u_{21}^2 = 1,$

and that it will also leave $B$ in (5.3) invariant under the transformation (5.1). The lower left-hand element of $A_1$ after such a transformation when set equal to zero, and the derivative of the last equation above, have the forms

$u_{21}u_{11}' - u_{11}u_{21}' + \cdots = 0,$
$u_{11}u_{11}' + s^2 u_{21}u_{21}' + ss' u_{21}^2 = 0,$

where the dots indicate terms not containing derivatives of the elements $u_{ij}$. If $u_{11}$ and $u_{21}$ are determined by these differential equations with initial values at a single point satisfying the second equation (5.2), they will satisfy that equation identically. The first equation (4.10) shows that $a_{22} = -a_{11}$, and we have the following theorem:

* For a more complete classification see Bamforth, A classification of boundary value problems for a system of ordinary differential equations of the second order, Dissertation, University of Chicago, 1927.

THEOREM 5.1. When $n = 2$ every definitely self-adjoint boundary value problem with $|B| \neq 0$ on the interval ab is transformable into a problem with matrices of the form

(5.3) $A = \begin{pmatrix} a_{11} & a_{12} \\ 0 & -a_{11} \end{pmatrix}, \quad B = \begin{pmatrix} 0 & -s^2 \\ 1 & 0 \end{pmatrix}, \quad T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 \\ 0 & s^2 \end{pmatrix}$

with $s(x) \neq 0$ on ab and $|M| = |N|$. Conversely, every problem with these properties is definitely self-adjoint. Such a problem has always an infinity of characteristic numbers.

The relation (20) of I shows that $|M| = |N|$. It is evident that the functions $f_i$ described in Corollary 3.1 above could not all be expansible as there stated if there were only a finite set of characteristic numbers and functions.

The case when the rank of the matrix $B(x)$ is unity everywhere on the interval ab gives rise to a number of normal forms of definitely self-adjoint problems. The matrix $S(x)$ is then transformable into

$S = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$

From the formula $S = TB$ and this form of $S$ it can readily be seen that the matrix $B$ and the most general transformation $U$ leaving $S$ invariant have the forms

$B = \begin{pmatrix} b_{11} & 0 \\ b_{21} & 0 \end{pmatrix}, \qquad U = \begin{pmatrix} \pm 1 & 0 \\ u_{21} & u_{22} \end{pmatrix}.$

Since $B$ has rank unity the elements $b_{11}$, $b_{21}$ do not vanish simultaneously, and a transformation $U$ with leading element $+1$ can be chosen so that $u_{22} = b_{21} - b_{11}u_{21} \neq 0$. Such a transformation will take $B$ into the form

(5.4) $B = \begin{pmatrix} b & 0 \\ 1 & 0 \end{pmatrix}$

by means of the second of the formulas (5.1). The conditions $TB + \bar{B}T = 0$, $S = TB$, $|T| \neq 0$ then imply that $b$ vanishes identically, and that $T$ has the form

(5.5) $T = \begin{pmatrix} t & 1 \\ -1 & 0 \end{pmatrix}.$

With the help of the first equation (19) of I we find that

(5.6) $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & -a_{11} \end{pmatrix}, \qquad t = c \exp\left[-\int_a^x 2a_{11}\, dx\right], \qquad a_{12}t = 0.$

The most general transformation $U$ leaving invariant the matrices $S$ and (5.4) with $b = 0$ is found to be

$U = \begin{pmatrix} \pm 1 & 0 \\ u & \pm 1 \end{pmatrix},$

and this will also leave $T$ invariant. By means of such a transformation we can make the lower left-hand element in $A$ vanish identically, by a method similar to that used above. The following theorem can then be established without serious difficulty:

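THEOREM 5.2. When $n = 2$ every definitely self-adjoint boundary value problem with matrix $B(x)$ identically of rank one on the interval ab is transformable into a problem with matrices of the form

(5.7) $A = \begin{pmatrix} a_{11} & a_{12} \\ 0 & -a_{11} \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$

When $a_{12} \neq 0$ the only transformation matrix possible is

(5.8) $T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$

and the problem is definitely self-adjoint if and only if the end conditions have matrices $M = (m_{ik})$, $N = (n_{ik})$ with $|M| = |N|$.

When $a_{12} \equiv 0$ the possible transformation matrices have the form

$T = \begin{pmatrix} t & 1 \\ -1 & 0 \end{pmatrix}, \qquad t = c \exp\left[-\int_a^x 2a_{11}\, dx\right],$

where $c$ is a constant. The problem is definitely self-adjoint with a matrix $T$ having $c \neq 0$ if and only if the matrices $M$ and $N$ of the end conditions have equal determinants and satisfy the conditions

$m_{12} = n_{12}k, \qquad m_{22} = n_{22}k, \qquad k = \exp\left[-\int_a^b a_{11}\, dx\right], \qquad (m_{12}, m_{22}) \neq (0, 0).$

The problem is definitely self-adjoint with a matrix $T$ having $c = 0$ if and only if $M$ and $N$ have equal determinants and

$(m_{12} + n_{12}k,\ m_{22} + n_{22}k) \neq (0, 0).$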


To prove the second statement of the theorem we note that the last two equations (5.6) imply $t \equiv 0$ when $a_{12} \neq 0$, and that equation (20) of I is satisfied only when $|M| = |N|$. If these conditions are fulfilled, the problem is self-adjoint. It is also definitely self-adjoint, according to the definition of §3 above, since the equations

(5.9) $y_1' = a_{11}y_1 + a_{12}y_2, \qquad y_2' = -a_{11}y_2, \qquad Sy = (y_1, 0) = (0, 0)$

imply that $y_2$ must vanish identically when $a_{12} \neq 0$.

For the case when $a_{12} \equiv 0$ the arguments of the preceding paragraphs show that the only possible transformation matrix is (5.5) with $t$ satisfying the conditions in (5.6). The third equation (4.10) implies $|M| = |N|$ and

$m_{12}^2 - k^2 n_{12}^2 = m_{12}m_{22} - k^2 n_{12}n_{22} = m_{22}^2 - k^2 n_{22}^2 = 0.$

For definite self-adjointness the solution

$y_1 = 0, \qquad y_2 = y_2(a) \exp\left[-\int_a^x a_{11}\, dx\right]$

of equations (5.9) must vanish identically if it satisfies also the end conditions of the problem, that is, if

(5.10) $m_{12}\,y_2(a) + n_{12}\,y_2(b) = 0, \qquad m_{22}\,y_2(a) + n_{22}\,y_2(b) = 0.$

This will be true if and only if the coefficients of $y_2(a)$ in the last two equations are not both zero. The statements in the theorem now follow readily from equations (5.9) and (5.10).

One can construct without difficulty definitely self-adjoint boundary 
value problems which have only a finite number of characteristic num- 
bers. For example, the problem with the matrices 


is definitely self-adjoint with the matrix (5.8) and has the determinant $D(\lambda) = 1$, and hence has no characteristic numbers. The problem with the same matrices $A$, $B$ and end-matrices

$M = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}, \qquad N = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$

is definitely self-adjoint and has $D(\lambda) = 2 - \lambda(b - a)$. It has a single characteristic number $\lambda = 2/(b - a)$. When

$M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad N = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$

the problem is self-adjoint but not definitely so, and the determinant $D(\lambda)$ vanishes identically. These examples are transforms into the normal forms described above of some equally simple ones communicated to me by Professor W. T. Reid. They show that the property of definite self-adjointness does not imply an infinity of characteristic numbers.

The boundary value problems arising from problems of Bolza in the plane are all of the first type described in Theorem 5.2 and have $a_{12}$ everywhere different from zero. Theorem 3.1 shows that in this case every function $f_1(x)$ with a continuous second derivative on the interval ab is expansible in the form (3.4), provided only that it satisfies the conditions (3.3) with some functions $f_2$ and $g_1$ at $x = a$ and $x = b$. It is evident that such problems must have an infinity of characteristic numbers since otherwise such expansions would not be possible in all cases.


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.

SOME EXISTENCE THEOREMS IN THE CALCULUS
OF VARIATIONS 


I. THE DRESDEN CORNER CONDITION* 


BY 
E. J. MCSHANE 


The present note is the first of a series, and is the only one which does violence to the general title by failing to exhibit an existence theorem. Instead, we here perform a few rather easy computations for later use, and by them obtain a simple proof of a corner condition for isoperimetric problems. This corner condition was apparently first established by Dresden as a consequence of the Weierstrass condition, so only the method of proof here can qualify as new. However, it will serve as a suggestive guide to further theorems in later papers.

1. Notation and continuity assumptions. In order that some of our preliminary calculations and lemmas shall be valid for both parametric and non-parametric problems, we shall recast non-parametric problems in parametric form. Given an integrand

$f(x, y, y') = f(x, y^1, \cdots, y^q, y^{1\prime}, \cdots, y^{q\prime}),$

we define

(1.1)
$F(z, z') = F(z^0, \cdots, z^q, z^{0\prime}, \cdots, z^{q\prime}) = z^{0\prime} f(z^0, z^1, \cdots, z^q, z^{1\prime}/z^{0\prime}, \cdots, z^{q\prime}/z^{0\prime}), \qquad z^{0\prime} > 0,$
$F(z, 0) = 0.$

We use a modification of the tensor summation convention. The repetition of a Greek-letter affix in a term requires the summation of the values of that term over all values of the affix. Thus

$F_\alpha z_n^\alpha = F_0 z_n^0 + \cdots + F_q z_n^q$

is summed over $\alpha = 0, 1, \cdots, q$, but not summed over the values of $n$. Also, if $v$ is a vector, its length will be denoted by $|v|$. Thus if $z = (z^0, \cdots, z^q)$, then $|z| = (z^\alpha z^\alpha)^{1/2}$.


* Presented to the Society, December 28, 1937; received by the editors October 29, 1937 and, in revised form, February 16, 1938.

Throughout the papers of this series all integrands $f(x, y, y')$ will be assumed (unless specific statement is made to the contrary) to be defined and continuous together with their partial derivatives of first and second order for all $(x, y)$ in a closed set $S$, and all $y'$. Furthermore, we assume without further mention the following:

(1.2) For every bounded subset $S_0$ of $S$ there is a constant $N$ such that
$f(x, y, y') + N(1 + |y'|^2)^{1/2} \geq 0$
for all $(x, y)$ in $S_0$ and all $y'$.

In the parametric notation, $F(z, z')$ being defined by (1.1), this takes the form:

(1.2') For every bounded subset $S_0$ of $S$ there is a constant $N$ such that $F(z, z') + N|z'| \geq 0$ for all $z$ in $S_0$ and all $z'$ with $z^{0\prime} > 0$.

Parametric integrands $F(z, z')$ (not those arising by (1.1) from non-parametric integrands) will be assumed to be defined and continuous for all $z$ in a closed set $S$ and all $z'$, to be positively homogeneous of degree 1 in $z'$, and to have continuous partial derivatives of the first and second orders for all $z$ in $S$ and for all $z' \neq (0, \cdots, 0)$.

In order to avoid frequent printing of a complicated symbol we make the definitions

$f_i(x, y, y') = f_{y^{i\prime}}(x, y, y'), \qquad i = 1, \cdots, q,$
$F_i(z, z') = F_{z^{i\prime}}(z, z'), \qquad i = 0, 1, \cdots, q.$

From the assumptions made on integrands $f(x, y, y')$ it follows at once that the function $F(z, z')$ defined by (1.1) is continuous with its partial derivatives of first and second orders for all $z = (x, y)$ in $S$ and all $z'$ with $z^{0\prime} > 0$, and is positively homogeneous of degree 1 in $z'$. Whenever we are discussing the integral of such an integrand along a curve $C$: $z = z(t)$ it will be tacitly assumed that $z^{0\prime}(t) > 0$ for almost all $t$, while for all auxiliary curves used in demonstrations we must prove $z^{0\prime}(t) > 0$ for almost all $t$.
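
The recasting (1.1) can be tried out on a concrete integrand. The sketch below is only an illustration under assumptions of our own: $q = 1$, a made-up integrand `f`, the curve $y = x^2$ on $0 \leq x \leq 1$, and a midpoint rule for the integrals. It checks positive homogeneity of degree 1 in $z'$ and compares the non-parametric integral with the parametric one taken with the parameter $t$, $x = t^2$.

```python
import math

def f(x, y, yp):
    """Hypothetical non-parametric integrand with q = 1 (illustrative only)."""
    return math.sqrt(1.0 + yp * yp) * (1.0 + x * x + y * y)

def F(z, zp):
    """Parametric integrand built from f as in (1.1); z = (z0, z1), zp = (z0', z1')."""
    if zp[0] == 0.0 and zp[1] == 0.0:
        return 0.0                                  # F(z, 0) = 0
    if zp[0] <= 0.0:
        raise ValueError("(1.1) defines F only for z0' > 0")
    return zp[0] * f(z[0], z[1], zp[1] / zp[0])

def midpoint(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Positive homogeneity of degree 1 in z'.
z, zp, k = (0.3, -1.2), (2.0, 0.7), 3.5
homogeneity_gap = abs(F(z, (k * zp[0], k * zp[1])) - k * F(z, zp))

# The curve y = x^2, first with x as parameter, then with x = t^2.
I_nonparam = midpoint(lambda x: f(x, x * x, 2.0 * x), 0.0, 1.0)
I_param = midpoint(lambda t: F((t * t, t ** 4), (2.0 * t, 4.0 * t ** 3)), 0.0, 1.0)
```

In the second parametrization $z^{0\prime}(t) = 2t$ vanishes only at $t = 0$, so the requirement $z^{0\prime}(t) > 0$ for almost all $t$ is met.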

If the representation $z = z(t)$, $(t_1 \leq t \leq t_2)$, of the curve $C$ satisfies this condition, the function* $F(z(t), \dot{z}(t))$ is measurable. Since by (1.2') it exceeds a summable function $-N|\dot{z}(t)|$, it has an integral, finite or infinite. Provided that the functions $z^i(t)$ are absolutely continuous, we denote this integral by

$\mathcal{F}(C) = \int_C F(z, \dot{z})\, dt = \int_{t_1}^{t_2} F(z(t), \dot{z}(t))\, dt.$


* In analogy with Carathéodory's notation, we define $\dot{z}(t)$ to be the vector $(z^{0\prime}(t), \cdots, z^{q\prime}(t))$ if the vector is defined and finite; otherwise $\dot{z}(t)$ is defined to be 0 (i.e., $(0, \cdots, 0)$).

Any two absolutely continuous parametrizations of $C$ (subject to the requirement that $z^{0\prime}(t) > 0$ for almost all $t$) give the same value to the integral.* In particular, if $z^0(t) = t$, so that $C$ has the form $y = y(x)$, $(a \leq x \leq b)$, and if furthermore the $y^i(x)$ are absolutely continuous, we write

$\mathcal{F}[y] = \int_a^b f(x, y, y')\, dx.$

If $F(z, z')$ is a parametric integrand, defined for all $z$ in a set $S$ and all $z'$, these questions of integrability do not arise. We write in this case also

$\mathcal{F}(C) = \int_C F(z, \dot{z})\, dt = \int_{t_1}^{t_2} F(z(t), \dot{z}(t))\, dt,$

provided that the functions $z^i(t)$ are absolutely continuous. The invariance under change of parameter is well known in this case. (It also follows from the theorem cited in the preceding footnote.)
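2. Differentiation formulas. Let $z = z(\sigma)$, $(\sigma_1 \leq \sigma \leq \sigma_2)$, be a Lipschitzian representation of a curve $C$. We shall suppose that $z = \phi(\tau)$ and $z = \psi(\tau)$ are two curves passing through the ends of $C$, so that

(2.1) $\phi(\tau_0) = z(\sigma_1), \qquad \psi(\tau_0) = z(\sigma_2),$

and we shall also suppose that $\phi$ and $\psi$ are absolutely continuous on an interval $(a, b)$ containing $\tau_0$, and have finite derivatives for $\tau = \tau_0$.
From $C$ we form the curve $C(\tau)$ defined by the equation†

(2.2) $z^i = z^i(\sigma, \tau) = z^i(\sigma) + (\phi^i(\tau) - \phi^i(\tau_0))\dfrac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1} + (\psi^i(\tau) - \psi^i(\tau_0))\dfrac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}, \qquad \sigma_1 \leq \sigma \leq \sigma_2, \quad i = 0, \cdots, q.$

This (by (2.1)) joins $\phi(\tau)$ to $\psi(\tau)$, and $z(\sigma, \tau_0) = z(\sigma)$. We wish to calculate the derivative of $\mathcal{F}(C(\tau))$, it being assumed that $C(\tau)$ lies in $S$ for $a < \tau < b$. Since

(2.3) $\mathcal{F}(C(\tau)) = \int_{\sigma_1}^{\sigma_2} F(z(\sigma, \tau), z_\sigma(\sigma, \tau))\, d\sigma,$

by differentiating under the integral sign we obtain


* E. J. McShane, Semi-continuity of integrals in the calculus of variations, Duke Mathematical Journal, vol. 2 (1936), pp. 597-616; in particular, Theorem 2.1 (the first five lines of proof being deleted).

† The curve $C(\tau)$ depends not merely on $C$ and $\tau$, but also on the particular parametrization $z = z(\sigma)$ which we chose for $C$.

(2.4)
$\dfrac{d}{d\tau}\mathcal{F}(C(\tau))\Big|_{\tau = \tau_0} = \int_{\sigma_1}^{\sigma_2} \left( F_{z^\alpha}\dfrac{\partial z^\alpha}{\partial \tau} + F_\alpha \dfrac{\partial^2 z^\alpha}{\partial\sigma\,\partial\tau} \right) d\sigma$
$= \int_{\sigma_1}^{\sigma_2} \Big\{ F_{z^\alpha}(z(\sigma), \dot{z}(\sigma)) \Big[ \phi^{\alpha\prime}(\tau_0)\dfrac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1} + \psi^{\alpha\prime}(\tau_0)\dfrac{\sigma - \sigma_1}{\sigma_2 - \sigma_1} \Big] + F_\alpha(z(\sigma), \dot{z}(\sigma))\, \dfrac{\psi^{\alpha\prime}(\tau_0) - \phi^{\alpha\prime}(\tau_0)}{\sigma_2 - \sigma_1} \Big\}\, d\sigma.$

If $F(z, z')$ arises from a non-parametric integrand by (1.1), equation (2.4) is still valid if we add the hypothesis that $z^{0\prime}(\sigma) \geq m > 0$ for almost all $\sigma$.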

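Several special cases of this formula will be of use to us. First, suppose $\phi(\tau) - \phi(\tau_0) = \psi(\tau) - \psi(\tau_0)$, so that (2.2) represents merely a translation of the curve $C$ by the (vector) amount $\phi(\tau) - \phi(\tau_0)$. Then

(2.5) $\mathcal{F}'(C(\tau_0)) = \phi^{\alpha\prime}(\tau_0) \int_{\sigma_1}^{\sigma_2} F_{z^\alpha}(z, \dot{z})\, d\sigma.$

(This is independent of the parametric representation of C.)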
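
The family of curves used in this section can be written out directly. The sketch below assumes the reading of (2.2) in which the displacement of the first end point is weighted by $(\sigma_2 - \sigma)/(\sigma_2 - \sigma_1)$ and that of the second by $(\sigma - \sigma_1)/(\sigma_2 - \sigma_1)$; the sample curves `z`, `phi`, `psi` are made up for illustration.

```python
def family(z, phi, psi, s1, s2, tau0):
    """Return the map (sigma, tau) -> z(sigma, tau) of our reading of (2.2)."""
    def curve(sigma, tau):
        w = (sigma - s1) / (s2 - s1)              # runs from 0 to 1 along C
        dphi = [a - b for a, b in zip(phi(tau), phi(tau0))]
        dpsi = [a - b for a, b in zip(psi(tau), psi(tau0))]
        return [c + (1.0 - w) * u + w * v
                for c, u, v in zip(z(sigma), dphi, dpsi)]
    return curve

# Hypothetical data: C is an arc of a parabola, with tau0 = 0.
z = lambda s: [s, s * s]                          # sigma1 = 0, sigma2 = 1
phi = lambda t: [t, 0.0]                          # phi(0) = z(0), as in (2.1)
psi = lambda t: [1.0 + t, 1.0 + 2.0 * t]          # psi(0) = z(1), as in (2.1)
C = family(z, phi, psi, 0.0, 1.0, 0.0)

# Special case leading to (2.5): both ends displaced by the same vector.
psi_translate = lambda t: [1.0 + t, 1.0]
C_translate = family(z, phi, psi_translate, 0.0, 1.0, 0.0)
```

For instance `C(0.0, 0.25)` equals `phi(0.25)`, `C(1.0, 0.25)` equals `psi(0.25)`, and `C_translate(s, t)` is `z(s)` moved by the vector `(t, 0)` for every `s`.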

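Suppose next that $C$ is a line segment, and that $C(\tau)$ is the line segment whose ends are $z(\sigma_1) + \tau\pi_1$ and $z(\sigma_2) + \tau\pi_2$, where $\pi_1$ and $\pi_2$ are given vectors. If the functions $z = z(\sigma)$ representing $C$ are linear, then (2.2) represents the line segment $C(\tau)$ if we take

(2.6) $\phi(\tau) = z(\sigma_1) + \tau\pi_1 \quad\text{and}\quad \psi(\tau) = z(\sigma_2) + \tau\pi_2,$

so that (2.4) becomes

(2.7) $\mathcal{F}'(C(0)) = \int_{\sigma_1}^{\sigma_2} F_{z^\alpha}(z(\sigma), \dot{z}) \Big[ \pi_1^\alpha \dfrac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1} + \pi_2^\alpha \dfrac{\sigma - \sigma_1}{\sigma_2 - \sigma_1} \Big]\, d\sigma + \int_{\sigma_1}^{\sigma_2} F_\alpha(z(\sigma), \dot{z})\, \dfrac{\pi_2^\alpha - \pi_1^\alpha}{\sigma_2 - \sigma_1}\, d\sigma.$

Applying the mean value theorem to the last term on the right, we obtain

(2.8) $\mathcal{F}'(C(0)) = \int_{\sigma_1}^{\sigma_2} F_{z^\alpha}(z(\sigma), \dot{z}) \Big[ \pi_1^\alpha \dfrac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1} + \pi_2^\alpha \dfrac{\sigma - \sigma_1}{\sigma_2 - \sigma_1} \Big]\, d\sigma + F_\alpha(z(\bar\sigma), \dot{z})(\pi_2^\alpha - \pi_1^\alpha),$

where $\sigma_1 < \bar\sigma < \sigma_2$.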
3. Interchange of arcs. Let $C_1$ and $C_2$ be rectifiable curves

(3.1) $C_1$: $z = z_1(\sigma)$, $\sigma_1 \leq \sigma \leq \sigma_2$; $\qquad C_2$: $z = z_2(\tau)$, $\tau_1 \leq \tau \leq \tau_2$

such that the beginning of $C_2$ coincides with the end of $C_1$:

(3.2) $z_1(\sigma_2) = z_2(\tau_1).$

The functions $z_1(\sigma)$ and $z_2(\tau)$ will be supposed to satisfy a Lipschitz condition, and if $F(z, z')$ is defined by (1.1) it will also be supposed that except on a set of measure zero $\dot{z}_1^0(\sigma)$ and $\dot{z}_2^0(\tau)$ are bounded from zero. We define $C_{12}$ to be the curve obtained by traversing first $C_1$ and then $C_2$. Furthermore, we define $C_{21}$ to be the curve obtained by starting at $z_1(\sigma_1)$, traversing a curve $C_2^*$ which is a translation of $C_2$, and then traversing a curve $C_1^*$ which is a translation of $C_1$. It is clear that the end of $C_{21}$ is the same as the end of $C_{12}$, namely $z_2(\tau_2)$. We now assume that the point

(3.3) $z_1(\sigma) + z_2(\tau) - z_2(\tau_1)$

lies in $S$ whenever $\sigma_1 \leq \sigma \leq \sigma_2$ and $\tau_1 \leq \tau \leq \tau_2$, and we proceed to compute

$\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}).$

By the definitions,

(3.4) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = \mathcal{F}(C_1^*) - \mathcal{F}(C_1) + \mathcal{F}(C_2^*) - \mathcal{F}(C_2).$

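We compute $\mathcal{F}(C_1^*) - \mathcal{F}(C_1)$ by (2.5), where we take $\phi(\tau) = z_2(\tau) - z_1(\sigma_2) + z_1(\sigma_1)$, $\psi(\tau) = z_2(\tau)$. If we momentarily let $C(\tau)$ be the curve (which by hypothesis lies in $S$), $z = z_1(\sigma) + z_2(\tau) - z_2(\tau_1)$, $(\sigma_1 \leq \sigma \leq \sigma_2)$, we obtain by (2.5)

(3.5) $\mathcal{F}'(C(\tau)) = \dot{z}_2^\alpha(\tau) \int_{\sigma_1}^{\sigma_2} F_{z^\alpha}(z_1(\sigma) + z_2(\tau) - z_2(\tau_1), \dot{z}_1(\sigma))\, d\sigma$

for all $\tau$ such that $\dot{z}_2(\tau)$ exists, that is, for almost all $\tau$. Hence, integrating from $\tau = \tau_1$ to $\tau = \tau_2$, we obtain

(3.6) $\mathcal{F}(C_1^*) - \mathcal{F}(C_1) = \mathcal{F}(C(\tau_2)) - \mathcal{F}(C(\tau_1)) = \int_{\tau_1}^{\tau_2}\int_{\sigma_1}^{\sigma_2} \dot{z}_2^\alpha(\tau)\, F_{z^\alpha}(z_1(\sigma) + z_2(\tau) - z_2(\tau_1), \dot{z}_1(\sigma))\, d\sigma\, d\tau.$

Next we let $C'(\sigma)$ be the curve $z = z_2(\tau) + z_1(\sigma) - z_1(\sigma_2)$, $(\tau_1 \leq \tau \leq \tau_2)$. (Recalling (3.2), we see that this lies by hypothesis in $S$ for $\sigma_1 \leq \sigma \leq \sigma_2$.) Then $C'(\sigma_2)$ is $C_2$; and $C'(\sigma_1)$ is $C_2^*$, for it is a translation of $C_2$ and it starts at $z_2(\tau_1) + z_1(\sigma_1) - z_1(\sigma_2)$, which is $z_1(\sigma_1)$ by (3.2). Again using (2.5), but with $\sigma$ serving in the role of $\tau$, we find

(3.7) $\mathcal{F}'(C'(\sigma)) = \dot{z}_1^\alpha(\sigma) \int_{\tau_1}^{\tau_2} F_{z^\alpha}(z_2(\tau) + z_1(\sigma) - z_1(\sigma_2), \dot{z}_2(\tau))\, d\tau.$

Integrating from $\sigma_1$ to $\sigma_2$, we have

(3.8) $\mathcal{F}(C_2) - \mathcal{F}(C_2^*) = \mathcal{F}(C'(\sigma_2)) - \mathcal{F}(C'(\sigma_1)) = \int_{\sigma_1}^{\sigma_2}\int_{\tau_1}^{\tau_2} \dot{z}_1^\alpha(\sigma)\, F_{z^\alpha}(z_2(\tau) + z_1(\sigma) - z_1(\sigma_2), \dot{z}_2(\tau))\, d\tau\, d\sigma.$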

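With Carathéodory, we now define

(3.9) $\Omega(z, p, q) = p^\alpha F_{z^\alpha}(z, q) - q^\alpha F_{z^\alpha}(z, p).$

Then from (3.4), (3.6), and (3.8) we obtain (recalling (3.2))

(3.10) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = -\int_{\tau_1}^{\tau_2}\int_{\sigma_1}^{\sigma_2} \Omega(z_2(\tau) + z_1(\sigma) - z_1(\sigma_2), \dot{z}_1(\sigma), \dot{z}_2(\tau))\, d\sigma\, d\tau.$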


We digress for a moment to make some remarks on the $\Omega$-function. It is clear that it is continuous for all $z$, $p$, and $q$, if $F$ is the usual parametric integrand. If $F$ arises by (1.1), then $\Omega$ is continuous if $p^0 > 0$ and $q^0 > 0$. From the definition, $\Omega(z, p, q)$ is positively homogeneous of degree 1 in $p$ and in $q$:

(3.11) $\Omega(z, kp, \kappa q) = k\kappa\,\Omega(z, p, q) \quad\text{if } k > 0 \text{ and } \kappa > 0.$

Also

(3.12) $\Omega(z, p, q) = -\Omega(z, q, p).$

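Suppose that $F(z, z')$ arises by (1.1) from $f(x, y, y')$, where $(z^0, z^1, \cdots, z^q) = (x, y^1, \cdots, y^q)$, and that the curves $C_1$, $C_2$ are represented in the form

$C_1$: $y = y_1(x)$, $a \leq x \leq b$; $\qquad C_2$: $y = y_2(x)$, $b \leq x \leq c$,

where $y_1(b) = y_2(b)$. We can if we wish revert to the integrand $f$ by simply using $x$ as parameter (it being still assumed that $y_1(x)$ and $y_2(x)$ are Lipschitzian). Then on $C_1$ we have $\sigma = x$; so $z_1^0 = \sigma = x$, and $\dot{z}_1^0(\sigma) = 1$. Likewise $\dot{z}_2^0(\tau) = 1$, so that by (3.10)

(3.13) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = -\int_b^c\int_a^b \omega(\tau + \sigma - b,\ y_2(\tau) + y_1(\sigma) - y_1(b),\ \dot{y}_1(\sigma),\ \dot{y}_2(\tau))\, d\sigma\, d\tau,$

where

(3.14) $\omega(x, y, y', Y') = f_x(x, y, Y') - f_x(x, y, y') + y^{\alpha\prime} f_{y^\alpha}(x, y, Y') - Y^{\alpha\prime} f_{y^\alpha}(x, y, y');$

that is,

(3.15) $\omega(x, y, y', Y') = \Omega(x, y; 1, y'; 1, Y').$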
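
The relation (3.15) and the properties (3.11), (3.12) can be checked numerically for a particular integrand. The sketch assumes the reading $\Omega(z, p, q) = p^\alpha F_{z^\alpha}(z, q) - q^\alpha F_{z^\alpha}(z, p)$ of (3.9), takes $q = 1$ and a made-up integrand `f`, and approximates the derivatives $F_{z^\alpha}$ by central differences; none of the names come from the paper.

```python
import math

def f(x, y, yp):
    """Hypothetical non-parametric integrand (illustrative only)."""
    return (1.0 + x * x + y * y) * math.sqrt(1.0 + yp * yp)

def F(z, zp):
    """Parametric integrand from (1.1); requires zp[0] > 0."""
    return zp[0] * f(z[0], z[1], zp[1] / zp[0])

def F_z(z, zp, h=1e-4):
    """Central-difference gradient of F with respect to the point z."""
    grad = []
    for a in range(2):
        up, dn = list(z), list(z)
        up[a] += h
        dn[a] -= h
        grad.append((F(up, zp) - F(dn, zp)) / (2.0 * h))
    return grad

def Omega(z, p, q):
    """Our reading of (3.9): p.F_z(z, q) - q.F_z(z, p)."""
    gq, gp = F_z(z, q), F_z(z, p)
    return sum(p[a] * gq[a] - q[a] * gp[a] for a in range(2))

def omega(x, y, yp, Yp):
    """(3.14) written out with the exact derivatives f_x, f_y of this f."""
    rY, ry = math.sqrt(1.0 + Yp * Yp), math.sqrt(1.0 + yp * yp)
    return 2.0 * x * (rY - ry) + 2.0 * y * (yp * rY - Yp * ry)

x, y, yp, Yp = 0.4, -0.7, 0.5, -1.3
lhs = omega(x, y, yp, Yp)
rhs = Omega((x, y), (1.0, yp), (1.0, Yp))
```

The two numbers `lhs` and `rhs` agree up to the error of the difference quotient, which is the content of (3.15) for this example.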


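As a special case (useful later) we may assume that $C_1$ and $C_2$ are line segments. If $l_i = L(C_i)$ is the length of $C_i$, $(i = 1, 2)$, we can represent $C_1$ and $C_2$ by means of linear functions of arc lengths:

(3.16) $C_i$: $z = z_i(s)$, $0 \leq s \leq l_i$, $\quad i = 1, 2.$

Then the derivative $\dot{z}_i(s)$ is constantly equal to a unit vector $p_i$, $(i = 1, 2)$, and the curves $C_{12}$ and $C_{21}$ bound a parallelogram (which may degenerate into a line segment). Formula (3.10) becomes

(3.17) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = -\int_0^{l_2}\int_0^{l_1} \Omega(z_1(\sigma) + z_2(\tau) - z_2(0), p_1, p_2)\, d\sigma\, d\tau.$

By the theorem of mean value,

(3.18) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = -l_1 l_2\, \Omega(z_1(\bar\sigma) + z_2(\bar\tau) - z_2(0), p_1, p_2),$

where $0 < \bar\sigma < l_1$ and $0 < \bar\tau < l_2$. If we denote $z_1(\bar\sigma) + z_2(\bar\tau) - z_2(0)$ by $\xi$, then $\xi$ is a point of the parallelogram bounded by $C_{12}$ and $C_{21}$, and (3.18) becomes

(3.19) $\mathcal{F}(C_{21}) - \mathcal{F}(C_{12}) = -l_1 l_2\, \Omega(\xi, p_1, p_2).$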
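
Formula (3.19) can be verified by hand in a case where $\Omega$ does not depend on $\xi$. The sketch assumes the made-up parametric integrand $F(z, z') = g(z)|z'|$ with $g(z) = 1 + A \cdot z$ linear, so that $F_{z^\alpha}(z, q) = A^\alpha |q|$ and, on our reading of (3.9), $\Omega(z, p, q) = (p \cdot A)|q| - (q \cdot A)|p|$. The vector `A` and the segments are hypothetical.

```python
import math

A = (0.3, -0.2)                       # hypothetical gradient of g(z) = 1 + A.z

def g(z):
    return 1.0 + A[0] * z[0] + A[1] * z[1]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def segment_integral(start, direction, length):
    """Integral of g(z)|z'| along a segment traversed by arc length.
    g is linear, so its midpoint value times the length is exact."""
    mid = (start[0] + 0.5 * length * direction[0],
           start[1] + 0.5 * length * direction[1])
    return g(mid) * length

def Omega(p, q):
    """Omega(z, p, q) for this F; it does not depend on z."""
    return dot(p, A) * math.hypot(*q) - dot(q, A) * math.hypot(*p)

z0 = (0.0, 0.0)
p1, l1 = (1.0, 0.0), 2.0              # C_1: unit direction and length
p2, l2 = (0.6, 0.8), 1.5              # C_2: unit direction and length

end1 = (z0[0] + l1 * p1[0], z0[1] + l1 * p1[1])
end2_star = (z0[0] + l2 * p2[0], z0[1] + l2 * p2[1])

F12 = segment_integral(z0, p1, l1) + segment_integral(end1, p2, l2)
F21 = segment_integral(z0, p2, l2) + segment_integral(end2_star, p1, l1)
difference = F21 - F12
predicted = -l1 * l2 * Omega(p1, p2)
```

Both `difference` and `predicted` come out as $-0.84$ for these data, in agreement with (3.19).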


4. Dresden's corner condition: the parametric form. In §4 we establish Dresden's corner condition for isoperimetric problems in parametric form. We shall proceed under the following hypotheses:

(4.1) The curve $C$: $z = z(t)$, $(t_1 \leq t \leq t_2)$, is interior to $S$ and gives a strong relative minimum to the integral

$\mathcal{F}(C) = \int_C F(z, \dot{z})\, dt$

in the class of all $D'$ curves which join $z(t_1)$ to $z(t_2)$ and give assigned values $\gamma_i$, $(i = 1, \cdots, m)$, to the integrals

$\mathcal{G}^i(C) = \int_C G^i(z, \dot{z})\, dt.$

(4.2) The curve $C$ has a corner at $z(t_0)$.
(4.3) The curve $C$ is normal.* (Hence there is a unique set of multipliers $l_1, \cdots, l_m$ such that for the function

$H(z, z') = F(z, z') + l_\alpha G^\alpha(z, z')$

the first variation of $\int_C H(z, \dot{z})\, dt$ vanishes along $C$.)

Then we have the theorem:

THEOREM 1. Under the hypotheses (4.1), (4.2), and (4.3) the inequality†

(4.4) $\Omega_H(z(t_0), z'(t_0 - 0), z'(t_0 + 0)) \leq 0$

holds.


* We do not require normality on subarcs of C.
† Here, where several integrands are involved, there are several possible $\Omega$-functions. The subscript indicates the integrand used in defining $\Omega$ by (3.9). An analogous notation will be used for the $\omega$-function.

For compactness, we define

(4.5) $p = z'(t_0 - 0), \qquad r = z'(t_0 + 0).$

Suppose that (4.4) fails to hold; then

(4.6) $\Omega_H(z(t_0), p, r) > 0.$

By continuity, there is a positive number $\delta$, so small that $C$ has no corner other than $z(t_0)$ on the arc $z = z(t)$, $t_0 - \delta \leq t \leq t_0 + \delta$, for which

(4.7) $\Omega_H(z(t), z'(t), r) > 0 \quad\text{if } t_0 - \delta \leq t < t_0.$

Define $C_{\tau,0}$ to be the curve obtained from $C$ by interchanging the arcs $z = z(t)$, $t_0 - \delta \leq t \leq t_0$, and $z = z(t)$, $t_0 \leq t \leq t_0 + \tau$, where $0 \leq \tau \leq \delta$. The curve $C_{\tau,0}$ is then defined by equations $z = \zeta_\tau(t)$, $(t_1 \leq t \leq t_2)$, where

$\zeta_\tau(t) = z(t), \qquad t_1 \leq t \leq t_0 - \delta,$
$\zeta_\tau(t) = z(t + \delta) - z(t_0) + z(t_0 - \delta), \qquad t_0 - \delta \leq t \leq t_0 - \delta + \tau,$
$\zeta_\tau(t) = z(t - \tau) + z(t_0 + \tau) - z(t_0), \qquad t_0 - \delta + \tau \leq t \leq t_0 + \tau,$
$\zeta_\tau(t) = z(t), \qquad t_0 + \tau \leq t \leq t_2.$

(Thus $C_{0,0}$ coincides with $C$. For $\tau$ near 0, it is clear that $C_{\tau,0}$ is in $S$.)


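Given any set of $m$ (vector) functions $\eta_j(t) = (\eta_j^0(t), \cdots, \eta_j^q(t))$ of class $D'$ on $[t_1, t_2]$ and vanishing at $t_1$ and $t_2$, we shall define

(4.8) $\Phi(\tau, b) = \Phi(\tau, b_1, \cdots, b_m) = \mathcal{F}(C_{\tau,b}),$
(4.9) $\Gamma_i(\tau, b) = \Gamma_i(\tau, b_1, \cdots, b_m) = \mathcal{G}^i(C_{\tau,b}),$

where $C_{\tau,b}$ is given by the following:

(4.10) $C_{\tau,b}$ is defined by $z = z(t; \tau, b) = \zeta_\tau(t) + b_\alpha\eta_\alpha(t)$, $(t_1 \leq t \leq t_2)$.

For $\tau$ and $b$ near 0, this lies in $S$. As we know,

(4.11) $\dfrac{\partial\Gamma_i}{\partial b_j}\Big|_{\tau = b = 0} = \int_{t_1}^{t_2} \big( G^i_{z^\alpha}(z, \dot{z})\,\eta_j^\alpha + G^i_\alpha(z, \dot{z})\,\dot\eta_j^\alpha \big)\, dt.$

We have assumed that $C$ is normal. Hence there exist functions $\eta_1(t), \cdots, \eta_m(t)$ of the type described above for which the jacobian

(4.12) $\left| \dfrac{\partial\Gamma_i}{\partial b_j} \right|_{\tau = b = 0} \neq 0.$

The equations

(4.13) $\Gamma_j(\tau, b) = \gamma_j, \qquad j = 1, \cdots, m,$

have the initial solutions $\tau = b = 0$, and at this solution the jacobian with respect to the $b_j$ is not zero by (4.12). So by the implicit functions theorem* there are functions $b_k = b_k(\tau)$, $(k = 1, \cdots, m)$, defined and of class $C'$ for all small non-negative $\tau$, such that

(4.14) $\Gamma_j(\tau, b(\tau)) = \gamma_j, \qquad j = 1, \cdots, m.$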
Let us define 


(4.15) H(r, b) = ®(r, b) + 6) = f H(z, 2)dt. 
Cc 


Since the first variation of $\int H\,dt$ vanishes along $C$, we have

(4.16) $\dfrac{\partial H}{\partial b_j} = 0$ for $\tau = b = 0$, $(j = 1, \dots, m)$.


By (3.10) and the mean value theorem, for b=0 we have 


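for some $\bar{t}$ between $t_0 - \delta$ and $t_0$. Combining (4.16) and (4.17) we have

(4.18) $\dfrac{d}{d\tau} H(\tau, b(\tau)) = \dfrac{\partial H}{\partial \tau} + \dfrac{\partial H}{\partial b_\alpha}\dfrac{d b_\alpha}{d\tau} = \dfrac{\partial H}{\partial \tau} < 0$

for $\tau = 0$. But by (4.14) and (4.15),

(4.19) $\dfrac{d}{d\tau} H(\tau, b(\tau)) = \dfrac{d}{d\tau} \Phi(\tau, b(\tau)).$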


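Now by (4.18) and (4.19) we see that

(4.20) $\mathcal{F}(C_{\tau, b(\tau)}) = \Phi(\tau, b(\tau)) < \Phi(0, b(0)) = \Phi(0, 0) = \mathcal{F}(C)$

for all small positive $\tau$, while by (4.14)

(4.21) $\mathcal{G}^j(C_{\tau, b(\tau)}) = \Gamma^j(\tau, b(\tau)) = \gamma_j, \quad (j = 1, \dots, m).$

This contradicts the minimizing property of $C$, and establishes the theorem.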

5. Dresden's corner condition: the non-parametric form. In order that the
proof of Theorem 1 shall apply also to integrands $F(z, z')$, $G^j(z, z')$ arising
by (1.1), it is only necessary to verify that $z^{0\prime}(t) > 0$ along each of the
comparison curves $C_{\tau,b}$ used. If $C$ has a class $D'$ representation $z^i = z^i(z^0)$,


* The fact that $\Gamma^j$ is defined only for $\tau \ge 0$ does not prevent our use of the theorem; we could
for example extend the range of definition of $\Gamma^j$ as a function of $\tau$ and then finally disregard the
extension.
$(i = 1, \dots, q)$, then on identifying the parameter $t$ with $z^0$ we have $z^{0\prime}(t) = 1$
along $C$. But along $C_{\tau, b(\tau)}$ we have

$\dfrac{d}{dt} z^0(t; \tau, b) = \zeta_\tau^{0\prime}(t) + b_\alpha \eta_\alpha^{0\prime}(t) = 1 + b_\alpha \eta_\alpha^{0\prime}(t) \ge 1 - |b| \max |\eta^{0\prime}(t)| > \tfrac{1}{2}$

if $|b|$ be small enough. So the comparison curves used, for which $|b|$ is near 0,
are allowable, and the proof applies to the non-parametric case also. Recalling
(3.15), we have the following result:

THEOREM 2. Assume the following conditions satisfied:

(5.1) The curve $y = y(x)$, $(x_1 \le x \le x_2)$, is interior to $S$ and gives a strong relative
minimum to the integral

$\mathcal{F}[y] = \int_{x_1}^{x_2} f(x, y, y')\,dx$

in the class of all $D'$ curves which join $(x_1, y(x_1))$ to $(x_2, y(x_2))$ and give assigned
values $\gamma_j$ to the integrals

$\mathcal{G}^j[y] = \int_{x_1}^{x_2} g^j(x, y, y')\,dx, \quad (j = 1, \dots, m).$

(5.2) $y = y(x)$ has a corner at $x_0$.
(5.3) The curve $y = y(x)$ is normal.
(5.4) $1, l_1, \dots, l_m$ are the isoperimetric constants, and

$h(x, y, y') = f(x, y, y') + l_\alpha g^\alpha(x, y, y').$

Then

$\omega_h(x_0, y(x_0), y'(x_0 - 0), y'(x_0 + 0)) \le 0.$
SOME EXISTENCE THEOREMS IN THE CALCULUS
OF VARIATIONS 


II. EXISTENCE THEOREMS FOR ISOPERIMETRIC 
PROBLEMS IN THE PLANE* 


BY 
E. J. McSHANE 


If we seek to find a curve $y = y(x)$, $(x_1 \le x \le x_2)$, which minimizes an integral
$\mathcal{F}[y] = \int f(x, y, y')\,dx$ in the class of curves joining two points $(x_1, y_1)$ and $(x_2, y_2)$,
a reasonable beginning is to choose a sequence $y = y_n(x)$ for which $\mathcal{F}[y_n]$ tends
to the lower bound $\mu$ of values of $\mathcal{F}[y]$, and then (under suitable hypotheses)
show that a subsequence of the $y_n(x)$ tends uniformly to a limit function
$y_0(x)$. However, the uniform convergence of $y_n$ to $y_0$ does not ensure that
$\mathcal{F}[y_n]$ tends to $\mathcal{F}[y_0]$. One way of overcoming this difficulty is to assume that
$\mathcal{F}[y]$ is quasi-regular, from which we find $\mathcal{F}[y_0] \le \liminf \mathcal{F}[y_n] = \mu$. Since
$\mathcal{F}[y_0]$ cannot be less than $\mu$, by the definition of $\mu$, it follows that $\mathcal{F}[y_0] = \mu$.
This method of attack was devised by L. Tonelli, and has been applied by
him and others, including Graves, Mania, Cinquini, and myself, to a number
of different types of variation problems. A second method is to add hypotheses
on $f(x, y, y')$ which will guarantee that for some sequence $\{y_n(x)\}$ not
only does $y_n(x)$ converge uniformly to $y_0(x)$, but also $y_n'(x)$ tends in some
manner to $y_0'(x)$, so that $f(x, y_0(x), y_0'(x))$ is the limit of $f(x, y_n(x), y_n'(x))$ in
some manner which will ensure the convergence of $\mathcal{F}[y_n]$ to $\mathcal{F}[y_0]$. This
method does not seem to have been nearly as thoroughly exploited as the
first. A very interesting existence theorem, established by this type of reasoning,
is to be found in a paper by Hans Lewy.†

Consider now the isoperimetric problem of minimizing $\mathcal{F}[y]$ while keeping
$\mathcal{G}[y] = \int g(x, y, y')\,dx$ constantly equal to a number $\gamma$. If we try to use the
first of these methods, we find a sequence for which $\mathcal{F}[y_n] \to$ min. while
$\mathcal{G}[y_n] = \gamma$. But now the limit curve $y = y_0(x)$ must satisfy the equation
$\mathcal{G}[y_0] = \gamma = \lim \mathcal{G}[y_n]$, and in order that this shall follow from the uniform
convergence of $y_n$ to $y_0$ the integrand $g(x, y, y')$ must be strongly restricted;
in fact, it must be linear in $y'$. And, in fact, for isoperimetric problems in


* Presented to the Society, April 15, 1938; received by the editors October 29, 1937 and, in 
revised form, February 16, 1938. 

† H. Lewy, Ueber die Methode der Differenzengleichungen ..., Mathematische Annalen, vol. 98
(1928), pp. 107-124. 
non-parametric form the only existence theorems* known to me make exactly
this requirement on $\mathcal{G}[y]$ or on $\mathcal{F}[y]$.

This suggests turning to the second proof-pattern, and putting conditions
on $\mathcal{F}$ and $\mathcal{G}$ which will guarantee that $y_n'(x)$ tends to $y_0'(x)$ in some manner
strong enough to ensure that $\mathcal{G}[y_n]$ tends to $\mathcal{G}[y_0]$ and $\mathcal{F}[y_n]$ tends to $\mathcal{F}[y_0]$.
In this note (and again in the fourth and fifth of the series) we set forth conditions
guaranteeing this.

The method here used is based on a formula (3.18) of I. If $\Pi$ is a polygon,
and two consecutive sides of $\Pi$ have slopes $\alpha$, $\beta$, respectively, and if it be
known that

$\omega_f(x, y, \alpha, \beta) \ge 0$

for all $(x, y)$, then the interchange of these sides does not increase $\mathcal{F}[y]$.
Furthermore, if the integrand $g(x, y, y')$ happens to be independent of $x$ and $y$,
the interchange leaves $\mathcal{G}(\Pi)$ unaltered. Suppose then that $\omega_f(x, y, p, r) \ge 0$
if $p \ge r$. If we choose a minimizing sequence of polygons $\Pi_n$: $y = y_n(x)$, we may
suppose that for each $n$ the slope of the sides of $\Pi_n$ increases monotonically.
For if ever a side is succeeded by one of lesser slope, we may interchange these
sides, leaving $\mathcal{G}$ unaltered and not increasing $\mathcal{F}$. Therefore each function
$y_n(x)$ is convex, and when we select a subsequence converging to a limit
$y_0(x)$ it will follow that $y_n'(x) \to y_0'(x)$ for almost all $x$.
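The bookkeeping behind this rearrangement can be illustrated numerically. The following is only a sketch under stated assumptions: a polygon is represented by a hypothetical list of `(dx, slope)` pairs, the side integrand is taken to be the arc-length integrand $g(y') = (1 + y'^2)^{1/2}$, and the function names are illustrative rather than taken from the paper. It shows that reordering the sides by increasing slope keeps the end point and the value of $\int g(y')\,dx$ while producing a convex polygon.

```python
import math

def sort_sides_by_slope(sides):
    """Reorder the sides (dx, slope) of a polygon by increasing slope."""
    return sorted(sides, key=lambda side: side[1])

def total_rise(sides):
    """Vertical displacement y(X) - y(x0) of the polygon."""
    return sum(dx * slope for dx, slope in sides)

def integral_of_g(sides, g):
    """Integral of g(y') dx over the polygon, for g depending on y' alone."""
    return sum(dx * g(slope) for dx, slope in sides)

def is_convex(sides):
    """A polygon y = y(x) is convex when its slopes never decrease."""
    slopes = [slope for _, slope in sides]
    return all(a <= b for a, b in zip(slopes, slopes[1:]))

# Hypothetical polygon: four sides given as (horizontal length, slope).
sides = [(0.5, 1.0), (0.25, -2.0), (1.0, 0.5), (0.25, 3.0)]
g = lambda slope: math.sqrt(1.0 + slope * slope)  # arc-length integrand

rearranged = sort_sides_by_slope(sides)
```

Because the integrand depends on the slope alone, its integral is a sum over the sides and is insensitive to their order; the same is true of the total rise, so the rearranged polygon still joins the same two points.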

In the theorem to be proved, we assume somewhat less than the condition
that $\omega_f(x, y, p, r) \ge 0$ if $p \ge r$, but the essence of the proof is unchanged; our
minimizing sequence is made to consist of several convex arcs.

The notation and definitions used in I will be continued in this note. Also, 
we add the rather obvious abbreviation “a.c.” for “absolutely continuous.” 

1. Proofs of some lemmas. Suppose that $F(z, z')$ is a parametric integrand
having the continuity properties required in I, §1, and that $C$ is a
rectifiable curve. It is easy to show that if $\{\Pi_n\}$ is a sequence of polygons
inscribed in $C$ and having the same initial and final points as $C$, and the
length of the longest side of $\Pi_n$ tends to 0 as $n \to \infty$, then $\mathcal{F}(\Pi_n) \to \mathcal{F}(C)$. For
problems in non-parametric form this is no longer true, as may be shown by
examples.†

Suppose, however, that $F(z, z')$ is obtained from a non-parametric integrand
$f(x, y, y')$ by (1.1) of I and satisfies the following condition:


* L. Tonelli, Fondamenti di Calcolo delle Variazioni, vol. 2, pp. 552, 553. 

† M. Lavrentieff, Sur quelques problèmes du calcul des variations, Annali di Matematica, (4),
vol. 4 (1927), pp. 7-28. 

L. Tonelli, Sur une question du calcul des variations, Matematicheskii Sbornik, vol. 33 (1936), 
pp. 87-98. 
(1.1) For every bounded subset $S_0$ of $S$ there are positive numbers $\delta$ and $a$ and
a number $b \ge 0$ such that
$|F_z(z, z')| \le a|z'| + bF(z_0, z')$
whenever $z_0$ is in $S_0$ and $|z - z_0| < \delta$.


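Then it can be shown* that if $y = y(x)$, $(a \le x \le b)$, is a.c., then for every
$\epsilon > 0$ there exists a function $y_\epsilon(x)$ having a continuous derivative on $a \le x \le b$,
such that $y_\epsilon(a) = y(a)$ and $y_\epsilon(b) = y(b)$, and such also that

$\left| \int_a^b f(x, y_\epsilon, y_\epsilon')\,dx - \int_a^b f(x, y, y')\,dx \right| < \epsilon.$

Moreover, if several integrands $f^1, \dots, f^k$ are such that for each of them
(1.1) holds, an examination of Tonelli's proof shows that $y_\epsilon(x)$ can be so
chosen that

$\left| \int_a^b f^j(x, y_\epsilon, y_\epsilon')\,dx - \int_a^b f^j(x, y, y')\,dx \right| < \epsilon, \quad j = 1, \dots, k.$

In particular, if $f(x, y, y')$ is a function of $y'$ alone, (1.1) surely holds; we need
only take $a = \delta = 1$, $b = 0$.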

We shall proceed to establish a theorem on isoperimetric problems. Sev- 
eral of the somewhat lengthy hypotheses of this theorem occur again as hy- 
potheses of later theorems, and some of the stages of the proof will also recur; 
so we separate the hypotheses and split the proof into a sequence of lemmas. 

The hypotheses for our first theorem are the following: 

(1.2) The functions $f(x, y, y')$ and $g^j(x, y, y')$, $(j = 1, \dots, m)$, satisfy the
continuity conditions of I, §1, on a closed set $S$.

(1.3) The functions $f(x, y, y')$ and $g^j(x, y, y')$ satisfy (1.1), and on every
bounded portion of $S$ the relations


jim = @, 


hold uniformly in (x, y). 

(1.4) The class $K$ consisting of all a.c. curves $y = y(x)$ joining two fixed points
$(x_0, y_0)$ and $(X, Y)$ with $x_0 < X$ and such that the integrals $\mathcal{G}^j[y]$ have given
fixed values $\gamma_j$, $(j = 1, \dots, m)$, is not empty.


* The proof requires only a minor modification of that given by Tonelli, loc. cit. The condition 
on Fy, or fz. here stated is superfluous in this connection. 



(1.5) The points $(x_0, y_0)$ and $(X, Y)$ and the integrals $\mathcal{F}$, $\mathcal{G}^j$ have the property
that there exist numbers $a_0, a_1, \dots, a_m$ with $a_0 \ge 0$ such that for every number $H$
there is a bounded subset $S_H$ of $S$ containing all a.c. curves $y = y(x)$ lying in $S$,
joining $(x_0, y_0)$ to $(X, Y)$, and having

$a_0 \mathcal{F}[y] + a_\alpha \mathcal{G}^\alpha[y] \le H.$


(1.6) The symbol $y$ stands for a single variable, and the infinite interval
$-\infty < y' < \infty$ can be subdivided into a finite number of subintervals $I_1, I_2, \dots, I_k$
(not necessarily in that order of precedence from $-\infty$ to $\infty$) such that for all $(x, y)$
the following relations hold:

$\omega_f(x, y, p, r) \ge 0$ if $p \in I_j$ and $r \in I_h$, $j > h$;
$\sigma_j \omega_f(x, y, p, r) \ge 0$ if $p \in I_j$ and $r \in I_j$ and $p \ge r$,
where $\sigma_j = \pm 1$, $j = 1, \dots, k$.

(1.7) The functions $g^j(x, y, y')$ are independent of $x$ and $y$.

(1.8) $S$ is the entire $(x, y)$-space.

(Observe that only in (1.6) do we restrict the number $q$ of variables $y^i$.)
With these hypotheses we can state the following theorem:


THEOREM 1. Under hypotheses (1.1), (1.2), (1.3), (1.4), (1.5), (1.6), (1.7),
and (1.8), the class $K$ contains a curve $y = y_0(x)$, $(x_0 \le x \le X)$, for which $\mathcal{F}[y]$
assumes its least value.


To establish the theorem we first prove six lemmas. 


Lemma 1. Under hypotheses (1.2), (1.3), (1.4), and (1.5), the greatest lower
bound $\mu$ of $\mathcal{F}[y]$ on the class $K$ is finite, and all the curves of $K$ for which
$\mathcal{F}[y] < \mu + 1$ lie interior to a sphere $Q$.


First, let $C^*$ be a curve of the class $K$. By hypothesis (1.5), the subclass $K_1$
of $K$ on which $\mathcal{F}(C) < \mathcal{F}(C^*) + 1$ lies in a bounded part $S_0$ of $S$. Let $Q$ be a
sphere (including boundary) large enough so that $S_0$ is interior to $Q$. Then
all curves of $K_1$, and a fortiori all curves $C$ for which $\mathcal{F}(C) < \mu + 1 \le \mathcal{F}(C^*) + 1$,
lie in $Q$. This establishes the second statement of the lemma. The g.l.b. of
$\mathcal{F}(C)$ on $K_1$ is clearly the same as its g.l.b. on $K$. By hypothesis (1.3), there is
a number $c$ such that $f(x, y, y') > 1$ if $(x, y)$ is in $QS$ and $|y'| > c$. On the
bounded closed set [$(x, y)$ in $QS$, $|y'| \le c$] the function $f(x, y, y')$ is continuous,
hence bounded. So $f(x, y, y')$ is bounded below for all $(x, y)$ in $QS$ and all
$y'$; say $f(x, y, y') \ge \nu$. It follows that for all curves $C$ of $K_1$

$\mathcal{F}(C) \ge \int_{x_0}^{X} \nu\,dx = \nu(X - x_0),$

and $\mathcal{F}(C)$ is bounded below on $K_1$. Therefore its g.l.b. $\mu$ is finite.
Lemma 2. If the integrands $f$, $g^j$ all satisfy (1.1) and hypothesis (1.4) holds,
then there exists a sequence of polygons $y = y_n(x)$, $(x_0 \le x \le X)$, such that
$y_n(x_0) = y_0$ and $y_n(X) = Y$ for all $n$, and

$\lim_{n \to \infty} \mathcal{F}[y_n] = \mu, \quad \lim_{n \to \infty} \mathcal{G}^j[y_n] = \gamma_j, \quad j = 1, \dots, m.$

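Let $C_1, C_2, \dots$ be a sequence of curves of $K$ such that $\mathcal{F}(C_n) \to \mu$. Since all
the integrands satisfy (1.1), for each $n$ there exists a curve $C_n^*$: $y = y_n^*(x)$,
$(x_0 \le x \le X)$, joining $(x_0, y_0)$ to $(X, Y)$, having $y_n^{*\prime}$ continuous, such that

(1.8) $|\mathcal{F}(C_n^*) - \mathcal{F}(C_n)| < 1/n, \quad |\mathcal{G}^j(C_n^*) - \mathcal{G}^j(C_n)| < 1/n, \quad (j = 1, \dots, m).$

Therefore $\mathcal{F}(C_n^*) \to \mu$ and $\mathcal{G}^j(C_n^*) \to \gamma_j$.

If $C$: $y = y(x)$, $(x_0 \le x \le X)$, is such that $y'(x)$ is continuous, we form a sequence
of inscribed polygons $\Pi_p$: $y = y_p(x)$ whose successive vertices are the
points

$(x_0, y_0),\ (x_0 + \delta_p, y(x_0 + \delta_p)),\ \dots,\ (x_0 + k\delta_p, y(x_0 + k\delta_p)),\ \dots,\ (X, Y),$

where $\delta_p = (X - x_0)/p$. From the continuity of $y'(x)$ it is easily seen that $y_p(x)$
tends uniformly to $y(x)$ and that (neglecting the vertices of the polygons)
$y_p'(x)$ tends uniformly to $y'(x)$. Hence for every $\epsilon > 0$ there is a $\Pi_p$ such that
$|\mathcal{F}(\Pi_p) - \mathcal{F}(C)| < \epsilon$ and $|\mathcal{G}^j(\Pi_p) - \mathcal{G}^j(C)| < \epsilon$, $(j = 1, \dots, m)$. Applying this to
each $C_n^*$, we see that for each $n$ there is a polygon $\Pi_n$: $y = y_n(x)$, $(x_0 \le x \le X)$,
with $y_n(x_0) = y_0$ and $y_n(X) = Y$ for which

(1.9) $|\mathcal{F}(\Pi_n) - \mathcal{F}(C_n^*)| < 1/n, \quad |\mathcal{G}^j(\Pi_n) - \mathcal{G}^j(C_n^*)| < 1/n.$

From (1.8) and (1.9) we obtain $\mathcal{F}(\Pi_n) \to \mu$ and $\mathcal{G}^j(\Pi_n) \to \gamma_j$.
The lemma is therefore established.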
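The inscribed-polygon step of this proof can be observed numerically in a simple case. The sketch below makes several assumptions that are not in the paper: the curve is $y = x^2$ on $[0, 1]$, the integral is the length integral $\int (1 + y'^2)^{1/2}\,dx$ (an integrand depending on $y'$ alone, so that (1.1) holds), and the function name is hypothetical. It compares the lengths of the polygons with equally spaced vertices against the closed-form length of the parabola.

```python
import math

def inscribed_polygon_length(y, x0, X, p):
    """Length of the polygon inscribed in y = y(x) with p equal steps delta_p."""
    delta = (X - x0) / p
    xs = [x0 + k * delta for k in range(p + 1)]
    return sum(
        math.hypot(xs[k + 1] - xs[k], y(xs[k + 1]) - y(xs[k]))
        for k in range(p)
    )

# Assumed test curve with continuous derivative: y = x^2 on [0, 1].
y = lambda x: x * x

# Closed form of the integral of sqrt(1 + 4 x^2) over [0, 1].
exact = math.sqrt(5.0) / 2.0 + math.asinh(2.0) / 4.0

# Gap between the curve's length and the polygon's length for several p.
errors = {p: exact - inscribed_polygon_length(y, 0.0, 1.0, p) for p in (10, 100, 1000)}
```

The gaps stay positive, since a chord is never longer than its arc, and they shrink as the number of sides grows, in line with the convergence used in the proof.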


Lemma 3. Assume that the following conditions hold:

(a) $Q$ is a bounded closed set of points $(x, y)$.

(b) The functions $f(x, y, y')$ and $g(x, y, y')$ are continuous functions of their
arguments for all $(x, y)$ in $Q$ and all $y'$.

(c) $f$ is non-negative for $(x, y)$ in $Q$ and all $y'$.

(d) For every positive number $N$ there is an $M \ge 0$ such that if $(x, y)$ is in $Q$
and $|g(x, y, y')| \ge M$, then $f(x, y, y') \ge N|g(x, y, y')|$.

(e) The a.c. functions $z = z_n(t)$, $(a_n \le t \le b_n)$, represent a sequence of curves
$\{C_n\}$ lying in $Q$ and such that the integrals $\mathcal{F}(C_n)$ are bounded, and the functions
$z_n^0(t)$ all satisfy the same Lipschitz condition.*

Then the integrals†

$\int_E G(z_n(t), z_n'(t))\,dt$

are equi-absolutely continuous, in the sense that for every $\epsilon > 0$ there is a $\delta > 0$
such that if $E$ is in $[a_n, b_n]$ and $mE < \delta$, then

$\left| \int_E G(z_n, z_n')\,dt \right| < \epsilon.$


By hypothesis, there are numbers $H$, $L$ such that $\mathcal{F}(C_n) < H$ and $z_n^{0\prime}(t) \le L$.
Let $\epsilon$ be a positive number, and let $N = 2H/\epsilon$. Hypothesis (d), written in
parametric notation, informs us that there is an $M \ge 0$ such that

$F(z, z') \ge N|G(z, z')|$ if $z \in Q$ and $|G(z, z')| \ge M z^{0\prime}$.

Therefore, since $F \ge 0$, we have for all $z \in Q$ and all $z'$ with $z^{0\prime} > 0$,

$F(z, z') + MN z^{0\prime} \ge N|G(z, z')|.$

In particular, since $L \ge z_n^{0\prime}(t) > 0$ almost everywhere in $[a_n, b_n]$, the inequality

$F(z_n, z_n') + MNL \ge F(z_n, z_n') + MN z_n^{0\prime} \ge N|G(z_n, z_n')|$

holds for almost all $t$ in the interval $[a_n, b_n]$. Now take $\delta = \epsilon/2ML$. If $E$ is in
$[a_n, b_n]$ and $mE < \delta$, then

$\left| \int_E G(z_n, z_n')\,dt \right| \le \int_E |G(z_n, z_n')|\,dt \le N^{-1} \int_E [F(z_n, z_n') + MNL]\,dt$
$\le N^{-1} \left[ \int_{a_n}^{b_n} F(z_n, z_n')\,dt + MNL \cdot mE \right]$
$\le N^{-1} [H + NML\delta] = \epsilon.$

The lemma is therefore established.

Remark. Clearly it would be enough to assume in place of (c) that
$f(x, y, y')$ is bounded below for $(x, y) \in Q$ and all $y'$, and to assume in place
of (d) that for each $N > 0$ there is an $M > 0$ and a $k$ independent of $M$ and $N$
such that if $(x, y) \in Q$ and $|g(x, y, y')| > M$, then $f(x, y, y') + k \ge N|g(x, y, y')|$.
For if $k'$ is a lower bound for $f(x, y, y')$ for $(x, y) \in Q$, we take $K$ to be the

* This condition on $z_n^0(t)$ certainly holds if $z_n^0(t) = t$; that is, if the $C_n$ are all represented in the
form $y = y_n(x)$, $(a_n \le x \le b_n)$. As always, we tacitly assume that $z_n^{0\prime}(t) > 0$ for almost all $t$.

† $F(z, z')$ and $G(z, z')$ denote, respectively, the parametric integrands associated with $f(x, y, y')$
and $g(x, y, y')$.

larger of $k$ and $-k'$. Then the functions $f(x, y, y') + K$ and $g(x, y, y')$ satisfy
the hypotheses of Lemma 3.


Lemma 4.* Assume that the following conditions hold:

(a) $Q$ is a bounded closed set of points $(x, y)$.

(b) $f(x, y, y')$ is continuous for all $(x, y)$ in $Q$ and all $y'$.

(c) $\lim_{|y'| \to \infty} f(x, y, y')/|y'| = \infty$ uniformly for all $(x, y)$ in $Q$.

(d) $\{C_n\}$ is a sequence of a.c. curves $y = y_n(x)$, $(a_n \le x \le b_n)$, lying in $Q$ and
such that the integrals $\mathcal{F}(C_n)$ are bounded.

Then the functions $y_n(x)$ are equi-absolutely continuous.


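The hypotheses of Lemma 3 are satisfied if we take $g(x, y, y') = |y'|$; so
the integrals

$\int_E |y_n'(x)|\,dx$

are equi-absolutely continuous. Let $\epsilon$ be a positive number. There is a $\delta > 0$
such that the integral above is less than $\epsilon$ if $E$ is in $[a_n, b_n]$ and $mE < \delta$. If
$(\alpha_1, \beta_1), \dots, (\alpha_k, \beta_k)$ is a set of non-overlapping subintervals of $[a_n, b_n]$
having length $\sum (\beta_i - \alpha_i) < \delta$, then

(1.10) $\sum_{i=1}^{k} |y_n(\beta_i) - y_n(\alpha_i)| = \sum_{i=1}^{k} \left| \int_{\alpha_i}^{\beta_i} y_n'(x)\,dx \right| \le \sum_{i=1}^{k} \int_{\alpha_i}^{\beta_i} |y_n'(x)|\,dx < \epsilon.$

This establishes our lemma.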

Lemma 5. Let hypotheses (a), (b), (c), and (d) of Lemma 3 be satisfied. Let
$\{y_n(x)\}$ be a sequence of a.c. functions all defined on the same interval $[x_0, X]$
and converging everywhere in $[x_0, X]$ to a limit $y_0(x)$. Suppose further that $y_n'(x)$
tends to $y_0'(x)$ for almost all $x$ in $[x_0, X]$. Then $\lim_{n \to \infty} \mathcal{G}[y_n] = \mathcal{G}[y_0]$ and
$\liminf_{n \to \infty} \mathcal{F}[y_n] \ge \mathcal{F}[y_0]$. Moreover, if hypothesis (c) of Lemma 4 holds, the limit
curve $y = y_0(x)$ is a.c.


For almost all $x$ we have $y_n(x) \to y_0(x)$ and $y_n'(x) \to y_0'(x)$; so for all such $x$
the limit of $f(x, y_n, y_n')$ is $f(x, y_0, y_0')$, and the limit of $g(x, y_n, y_n')$ is $g(x, y_0, y_0')$.
The function $f(x, y, y')$ is non-negative; therefore by the lemma of Fatou†
we obtain

$\int_{x_0}^{X} f(x, y_0, y_0')\,dx \le \liminf_{n \to \infty} \int_{x_0}^{X} f(x, y_n, y_n')\,dx.$

* Lemmas 3 and 4 are closely related to some theorems established by M. Nagumo (Ueber die
gleichmässige Summierbarkeit und ihre Anwendung auf ein Variationsproblem, Japanese Journal of
Mathematics, vol. 6 (1929), pp. 173-182).

† P. Fatou, Séries trigonométriques et séries de Taylor, Acta Mathematica, vol. 30 (1906), p. 375.
The integrals of the $g(x, y_n, y_n')$ are equi-absolutely continuous functions of
sets, by Lemma 3; so by a known convergence theorem

$\int_{x_0}^{X} g(x, y_0, y_0')\,dx = \lim_{n \to \infty} \int_{x_0}^{X} g(x, y_n, y_n')\,dx.$

The final conclusion is obtained at once from (1.10) if we let $n \to \infty$.

Lemma 6. If $y = y_n(x)$, $(x_1 \le x \le x_2)$, is a sequence of real-valued functions
defined and convex on the interval $x_1 \le x \le x_2$, and the $y_n(x)$ converge uniformly
to a limit function $y_0(x)$ on $x_1 \le x \le x_2$, then $\lim_{n \to \infty} y_n'(x) = y_0'(x)$ for almost all $x$.

It is easy to see that $y_0(x)$ is also a convex function. Hence each derivative
$y_j'(x)$, $(j = 0, 1, 2, \dots)$, is defined almost everywhere on $[x_1, x_2]$ and is monotone
increasing. Let $E$ be the set of measure $x_2 - x_1$ on which all these derivatives
are defined. Let $x_0$ be a point of $E$. If $h > 0$ is such that $x_0 + h \le x_2$, then

(1.11) $\lim_{n \to \infty} \dfrac{y_n(x_0 + h) - y_n(x_0)}{h} = \dfrac{y_0(x_0 + h) - y_0(x_0)}{h}.$

But since $y_n(x)$ is convex, we know that $y_n'(x_0) \le [y_n(x_0 + h) - y_n(x_0)]/h$.
Hence by this and (1.11) we obtain

(1.12) $\limsup_{n \to \infty} y_n'(x_0) \le \dfrac{y_0(x_0 + h) - y_0(x_0)}{h}, \quad h > 0.$

Now let $h \to 0$; this yields

(1.13) $\limsup_{n \to \infty} y_n'(x_0) \le y_0'(x_0).$

Repeating the argument with $h < 0$, we find

(1.14) $\liminf_{n \to \infty} y_n'(x_0) \ge y_0'(x_0).$

Inequalities (1.13) and (1.14) show that $y_n'(x_0) \to y_0'(x_0)$. Since $x_0$ is any point
of $E$, this establishes the lemma.
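A concrete instance may make the statement of Lemma 6 easier to picture. The sketch below is an illustration under assumptions of its own: the convex functions are taken to be $y_n(x) = (x^2 + 1/n)^{1/2}$ on $[-1, 1]$, which converge uniformly to $y_0(x) = |x|$, and all function names are hypothetical. The derivatives converge to $\pm 1$ at every $x \ne 0$, while at the single point $x = 0$ the limit function has no derivative, which is consistent with the "almost all $x$" in the lemma.

```python
import math

def y_n(x, n):
    """Convex approximation sqrt(x^2 + 1/n) of |x|."""
    return math.sqrt(x * x + 1.0 / n)

def dy_n(x, n):
    """Derivative of y_n with respect to x."""
    return x / math.sqrt(x * x + 1.0 / n)

def y_0(x):
    """Uniform limit of the y_n."""
    return abs(x)

def sup_gap(n, grid):
    """Largest value of |y_n - y_0| over a finite grid of points."""
    return max(abs(y_n(x, n) - y_0(x)) for x in grid)

# Grid of 201 equally spaced points on [-1, 1], including x = 0.
grid = [k / 100.0 - 1.0 for k in range(201)]
```

For this family the largest gap occurs at $x = 0$ and equals $n^{-1/2}$, so the convergence is uniform, and `dy_n(0.5, n)` approaches 1 as `n` grows.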

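2. Proof of the theorem; examples. The preliminaries being disposed of,
we take up the proof of the theorem. Let $\mu$ be the greatest lower bound of
$\mathcal{F}[y]$ on the class $K$; by Lemma 1 this is finite. By Lemma 2, we can select a
sequence of polygons $\Pi_n^*$: $y = y_n^*(x)$, $(x_0 \le x \le X)$, joining $(x_0, y_0)$ and $(X, Y)$
and such that

$\lim_{n \to \infty} \mathcal{F}(\Pi_n^*) = \mu, \quad \lim_{n \to \infty} \mathcal{G}^j(\Pi_n^*) = \gamma_j, \quad (j = 1, \dots, m).$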
Suppose that $AB$ and $BC$ are consecutive sides of one of the polygons $\Pi_n^*$,
having respective slopes $\alpha$ and $\beta$. Let $D$ be the fourth vertex of the parallelogram
having $AB$ and $BC$ as sides. By "interchanging" $AB$ and $BC$ in the
polygon $\Pi_n^*$ we shall mean (as in I) the operation of forming the polygon $\Pi_n'$
which has all the sides of $\Pi_n^*$ except $AB$ and $BC$, and has sides $AD$ and $DC$
to replace them. From (3.13) or (3.18) of I we know that if $\omega_f(x, y, \alpha, \beta) \ge 0$
for all $(x, y)$ in the parallelogram $ABCD$, then

$\mathcal{F}(ADC) - \mathcal{F}(ABC) \le 0;$

whence $\mathcal{F}(\Pi_n') \le \mathcal{F}(\Pi_n^*)$. Since by hypothesis (1.7) the integrands $g^j(x, y, y')$
are independent of $x$ and $y$, it is obvious that $\mathcal{G}^j(\Pi_n') = \mathcal{G}^j(\Pi_n^*)$, $(j = 1, \dots, m)$.

In particular, if the slope $\alpha$ of $AB$ belongs to an interval $I_j$ and $\beta$ belongs
to an interval $I_h$ with $h < j$, then $\omega_f(x, y, \alpha, \beta) \ge 0$ for all $(x, y)$, by hypothesis
(1.6). So on $\Pi_n^*$ we search for the first side (that is, side with the least $x$)
whose slope belongs to the interval $I_1$. If this is not already the first side of
$\Pi_n^*$, we can interchange it successively with all preceding sides so as to bring
it to first place. These interchanges do not increase the integral $\mathcal{F}$ and leave
the integrals $\mathcal{G}^j$ unchanged. Next we locate the second one of these sides
whose slope belongs to the interval $I_1$. We can interchange this with preceding
sides (if any) so as to bring it to second place; the integral $\mathcal{F}$ is not increased,
and the integrals $\mathcal{G}^j$ are unchanged. Proceeding thus, we finally bring all
those sides of $\Pi_n^*$ whose slopes are in $I_1$ into first, second, $\dots$, places
in unbroken succession. Next we locate the sides whose slopes are in $I_2$,
and bring them together after the sides whose slopes are in $I_1$; and continue
so until all intervals $I_j$ have been considered. We thus have a polygon
$\Pi_n'$ joining $(x_0, y_0)$ to $(X, Y)$, having $\mathcal{F}(\Pi_n')$ not greater than $\mathcal{F}(\Pi_n^*)$, and
such that $\mathcal{G}^j(\Pi_n') = \mathcal{G}^j(\Pi_n^*)$, $(j = 1, \dots, m)$. Moreover, if $AB$ and $CD$ are
sides of $\Pi_n'$ such that the slope of $AB$ is in $I_j$ and the slope of $CD$ is in $I_h$ with
$h > j$, then $AB$ precedes $CD$ on the polygon.

Now for any particular number $h$ of the set $1, \dots, k$ we consider the aggregate
of sides of $\Pi_n'$ whose slopes belong to $I_h$. Suppose, to be specific, that
the number $\sigma_h$ of (2.5) is $+1$. If $AB$ and $BC$ are consecutive sides of the aggregate
having respective slopes $\alpha$ and $\beta$, and $\beta < \alpha$, then by hypothesis (1.6) we
have $\omega_f(x, y, \alpha, \beta) \ge 0$ for all $(x, y)$. So, by (3.13) of I, if we interchange $AB$
and $BC$, the value of $\mathcal{F}$ is not increased. Therefore in this aggregate of sides
we seek one having least slope. It can be interchanged successively with all
preceding sides so as to bring it to first place in the aggregate; in the interchange
the value of $\mathcal{F}$ is not increased and the values of the $\mathcal{G}^j$ are unchanged.
Of the remaining sides of the aggregate we seek the one with least slope; this
can likewise be brought to second place in the aggregate. Proceeding thus,
we find that we can rearrange the sides of the aggregate so as to have their
slopes monotonically increasing; the rearrangement leaves the $\mathcal{G}^j$ unchanged
and does not increase the $\mathcal{F}$. If the value of $\sigma_h$ had been $-1$, we could have
carried out a rearranging process so as to have the slopes of the sides steadily
decreasing instead of increasing.

This process having been carried out for each value of $h$, $(h = 1, \dots, k)$,
we arrive finally at a polygon $\Pi_n$: $y = y_n(x)$, $(x_0 \le x \le X)$, joining $(x_0, y_0)$ and
$(X, Y)$, having $\mathcal{F}(\Pi_n) \le \mathcal{F}(\Pi_n^*)$ and $\mathcal{G}^j(\Pi_n) = \mathcal{G}^j(\Pi_n^*)$, $(j = 1, \dots, m)$. The
polygon $\Pi_n$ consists of at most $k$ polygonal arcs on each of which the slope is
either monotonic increasing or monotonic decreasing; that is, $\Pi_n$ is composed
of at most $k$ convex or concave polygonal arcs.

Let us suppose that this has been done for each $\Pi_n^*$. For all the polygons
$\Pi_n$ thus obtained the sum

$a_0 \mathcal{F}(\Pi_n) + a_\alpha \mathcal{G}^\alpha(\Pi_n) \le a_0 \mathcal{F}(\Pi_n^*) + a_\alpha \mathcal{G}^\alpha(\Pi_n^*)$

is bounded above, since the sequences $\mathcal{F}(\Pi_n^*)$ and $\mathcal{G}^j(\Pi_n^*)$ converge. So by
hypothesis (1.5) all the polygons $\Pi_n$ lie in a bounded set. By Lemma 4, the
functions $y_n(x)$ are equi-absolutely continuous, hence equi-continuous; hence
by Ascoli's theorem it is possible to select a subsequence which converges
uniformly to a limit function $y_0(x)$, $(x_0 \le x \le X)$. We suppose that this subsequence
is the whole sequence $\{y_n(x)\}$.

For each $n$ we can choose points $x_{0,n} = x_0 \le x_{1,n} \le \dots \le x_{k,n} = X$ such that
for $x_{i-1,n} \le x \le x_{i,n}$ the derivative $y_n'(x)$ is in $I_i$ whenever it is defined. (If $y_n'$
is never in $I_i$, then $x_{i,n} = x_{i-1,n}$.) It is possible to choose a subsequence of $y_n$
(we suppose it to be the whole sequence) for which $x_{i,n}$ tends to a limit $x_{i,0}$ as
$n \to \infty$. On each interval interior to $x_{i-1,0} \le x \le x_{i,0}$ the convex (or concave)
functions $y_n(x)$ tend uniformly to $y_0(x)$; hence $y_0(x)$ is convex (or concave)
on the interval $x_{i-1,0} < x < x_{i,0}$. By Lemma 6, on every interval $(\alpha, \beta)$ interior to
$[x_{i-1,0}, x_{i,0}]$ the derivative $y_n'(x)$ tends almost everywhere to $y_0'(x)$. Hence the
derivative of $y_n(x)$ tends to $y_0'(x)$ for almost all $x$ in $[x_0, X]$; and by Lemma 5
$y_0(x)$ is a.c. and

$\mathcal{F}[y_0] \le \liminf_{n \to \infty} \mathcal{F}[y_n] \le \mu, \quad \mathcal{G}^j[y_0] = \lim_{n \to \infty} \mathcal{G}^j[y_n] = \gamma_j.$

But $y_0(x)$ is in the class $K$; so $\mathcal{F}[y_0] \ge \mu$ by the definition of $\mu$. Hence $\mathcal{F}[y_0] = \mu$,
and the theorem is proved.

Examples. (1) An example of a function $f(x, y, y')$ such that $\omega_f(x, y, p, q) > 0$
if $p > q$ is

$f(x, y, y') = y'^2 + y(1 + y'^2)^{1/2}.$

Here $f_x = 0$, $f_y = (1 + y'^2)^{1/2}$; so (1.1) is satisfied with $a = \delta = 1$, $b = 0$. Also,

$\omega_f(x, y, p, q) = p(1 + q^2)^{1/2} - q(1 + p^2)^{1/2}$
$= \{(1 + p^2)(1 + q^2)\}^{1/2} \left[ \dfrac{p}{(1 + p^2)^{1/2}} - \dfrac{q}{(1 + q^2)^{1/2}} \right],$

which is positive if $p > q$. Thus the whole interval $-\infty < y' < \infty$ can be taken
as one interval $I_1$, with $\sigma_1 = +1$, in (1.6). We take, for example, $g^1 = (1 + y'^2)^{1/2}$,
$g^2 = |y'|/2$. Hypotheses (1.2) and (1.3) are obviously satisfied. Hypothesis
(1.5) holds if we take $a_0 = a_2 = 0$, $a_1 = 1$, for $\mathcal{G}^1[y]$ is the length of the curve
$y = y(x)$. Let $(x_0, y_0)$ and $(X, Y)$ be any two points with $x_0 < X$, and let $\gamma_1$, $\gamma_2$
be any two numbers. Then if the class $K$ of a.c. curves joining $(x_0, y_0)$ to
$(X, Y)$ and giving $\mathcal{G}^1[y]$ and $\mathcal{G}^2[y]$ the respective values $\gamma_1$, $\gamma_2$ is not empty, it
contains a minimizing curve for $\mathcal{F}[y]$.
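The sign claim for this first example is easy to spot-check numerically. The sketch below assumes the expression $\omega(p, q) = p(1 + q^2)^{1/2} - q(1 + p^2)^{1/2}$ for this integrand and compares it with a factored form built on the increasing function $t/(1 + t^2)^{1/2}$; the function names and the sample slopes are hypothetical choices made for illustration.

```python
import math

def omega(p, q):
    """omega for f = y'^2 + y (1 + y'^2)^(1/2), as a function of the two slopes."""
    return p * math.sqrt(1.0 + q * q) - q * math.sqrt(1.0 + p * p)

def omega_factored(p, q):
    """Same quantity, factored so that its sign is visible."""
    scale = math.sqrt((1.0 + p * p) * (1.0 + q * q))
    return scale * (p / math.sqrt(1.0 + p * p) - q / math.sqrt(1.0 + q * q))

samples = [-50.0, -3.0, -1.0, -0.1, 0.0, 0.2, 1.0, 7.5, 40.0]

# omega should be positive whenever the first slope exceeds the second.
positive_when_p_gt_q = all(
    omega(p, q) > 0.0 for p in samples for q in samples if p > q
)

# The two algebraic forms should agree up to rounding.
forms_agree = all(
    math.isclose(omega(p, q), omega_factored(p, q), rel_tol=1e-9, abs_tol=1e-9)
    for p in samples
    for q in samples
)
```

Since $t/(1 + t^2)^{1/2}$ increases with $t$, the bracket in the factored form is positive exactly when $p > q$, which is why a single interval with $\sigma_1 = +1$ suffices in (1.6).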
(2) For another example we use the same integral

$\mathcal{F}[y] = \int_0^1 [y'^2 + y(1 + y'^2)^{1/2}]\,dx,$

but now we impose no side conditions and require that $(x_0, y_0)$ and $(X, Y)$
be $(0, 0)$ and $(1, 0)$, respectively. Hypotheses (1.2) and (1.3) again hold, as
does (1.6). Hypothesis (1.4) is satisfied vacuously. To show that (1.5) also
holds (with $a_0 = 1$; there are no other $a_j$) we use Schwarz' inequality. For
$0 \le x \le 1$ we have

$(y(x))^2 = \left( \int_0^x y'(x)\,dx \right)^2 \le x \int_0^x y'^2\,dx,$

and likewise $(y(x))^2 \le (1 - x) \int_x^1 y'^2\,dx$, so that

$(y(x))^2 \le \min(x, 1 - x) \int_0^1 y'^2\,dx \le \tfrac{1}{2} \int_0^1 y'^2\,dx.$


Hence, again using Schwarz’ inequality, we have 


7[y] = J + 5%) + + — 1 


1 1 1 1/2 
2 fo fat sah 
0 0 0 


1 
f (1+ 9%)dx — 1 
0 


1 1 
y? — 4]dx. 


IV 
The last integral satisfies (1.5) (with $a_0 = 1$); so $\mathcal{F}[y]$ does also. (Note, however,
that for pairs of end points other than $(0, 0)$ and $(1, 0)$ the condition
(1.5) may fail.)

By Theorem 1 we see that in the class of all a.c. curves joining $(0, 0)$ and
$(1, 0)$ there is a minimizing curve for $\mathcal{F}[y]$. This result is not quite trivial;
for the integral $\mathcal{F}[y]$ is not even quasi-regular, since

$f_{y'y'} = 2 + y(1 + y'^2)^{-3/2},$

which is not invariant in sign. It is clear that a large class of non-regular problems
comes under our theorem, for $\omega_f$ involves only the partial derivatives $f_x$
and $f_y$, and if $f$ satisfies (1.6) so does $f(x, y, y') + \varphi(y')$ for every continuously
differentiable function $\varphi(y')$.

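(3) For a third example we take $f(x, y, y') = e^y y'^2$. Here

$\omega_f(x, y, p, r) = p e^y r^2 - r e^y p^2 = e^y p r (r - p).$

So if we take $I_1$ to be the interval $-\infty < y' < 0$ and $I_2$ to be $0 \le y' < \infty$, we have

$\omega_f(x, y, p, r) \ge 0, \quad p \in I_2,\ r \in I_1,$

while

$-\omega_f(x, y, p, r) \ge 0$ if $p \ge r$ and $p$, $r$ are both in $I_1$ or both in $I_2$.

Hence (1.6) holds with $\sigma_1 = \sigma_2 = -1$. If, for example, we take $\mathcal{G}^1$ and $\mathcal{G}^2$ as
in the first example, the class $K$ contains a minimizing curve for $\mathcal{F}[y]$. This
curve will consist either of two concave arcs, on the first of which $y' < 0$ and
on the second of which $y' \ge 0$, or else it will consist of just one concave arc
on which $y'$ does not change sign.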
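The rearrangement used in the proof of Theorem 1 can be traced on a small polygon for this third example. The sketch below is illustrative only and rests on stated assumptions: a polygon is a hypothetical list of `(dx, slope)` pairs starting at height 0, the integrand is $f = e^y y'^2$ with $I_1 = (-\infty, 0)$, $I_2 = [0, \infty)$ and $\sigma_1 = \sigma_2 = -1$, the side condition is the length integral, and only interchanges of consecutive sides are used, as in the proof.

```python
import math

def group_index(slope):
    """Index of the interval containing the slope: I_1 = (-inf, 0), I_2 = [0, inf)."""
    return 1 if slope < 0.0 else 2

def should_swap(first, second):
    """True when interchanging two consecutive sides is called for by (1.6)."""
    g1, g2 = group_index(first[1]), group_index(second[1])
    if g1 != g2:
        return g1 > g2          # sides with slope in I_1 come first
    return first[1] < second[1]  # sigma = -1: slopes decrease within a group

def rearrange(sides):
    """Repeat interchanges of consecutive sides until none is called for."""
    sides = list(sides)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(sides) - 1):
            if should_swap(sides[i], sides[i + 1]):
                sides[i], sides[i + 1] = sides[i + 1], sides[i]
                swapped = True
    return sides

def F(sides, y_start=0.0):
    """Integral of e^y y'^2 dx along the polygon, side by side in closed form."""
    total, y = 0.0, y_start
    for dx, m in sides:
        if m != 0.0:
            total += m * math.exp(y) * math.expm1(m * dx)
        y += m * dx
    return total

def length(sides):
    """Length integral, an integrand independent of x and y."""
    return sum(dx * math.sqrt(1.0 + m * m) for dx, m in sides)

sides = [(0.2, 1.0), (0.3, -2.0), (0.1, 0.5), (0.2, -1.0), (0.2, 2.0)]
new_sides = rearrange(sides)
new_slopes = [m for _, m in new_sides]
```

The result consists of one concave arc with negative slopes followed by one concave arc with non-negative slopes; the length is unchanged and the value of `F` is not increased, which matches the description of the minimizing curve given above.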

3. Integrals in parametric form. Related to Theorem 1 there is a theorem
on integrals in parametric form; but it applies only to a very restricted class
of integrands. The hypotheses on these integrands will be the following:

(3.1) The function $F(y, x', y')$ is independent of $x$ and of the sign of $x'$:
$F(y, -x', y') = F(y, x', y')$.

(3.2) The functions $G^j(x', y')$, $(j = 1, \dots, m)$, depend only on $|x'|$ and $y'$:
$G^j(-x', y') = G^j(x', y')$.

(3.3) There is a set of constants $a_0, a_1, \dots, a_m$ with $a_0 \ge 0$ such that for every
number $H$ all the rectifiable curves $C$ beginning at a fixed point and having
$a_0 \mathcal{F}(C) + a_\alpha \mathcal{G}^\alpha(C) \le H$ have lengths less than a constant $\Lambda_H$.

(3.4) The interval $-\pi/2 < \theta < \pi/2$ can be subdivided into a finite number of
subintervals $I_1, \dots, I_k$ such that:

$\Omega_F(x, y, \cos\theta_1, \sin\theta_1, \cos\theta_2, \sin\theta_2) \ge 0$ if $\theta_1 \in I_j$ and $\theta_2 \in I_h$ with $j > h$,
$\sigma_j \Omega_F(x, y, \cos\theta_1, \sin\theta_1, \cos\theta_2, \sin\theta_2) \ge 0$ if $\theta_1 \in I_j$ and $\theta_2 \in I_j$ and $\theta_1 \ge \theta_2$,

where $\sigma_j = \pm 1$, $(j = 1, \dots, k)$.

(3.5) The class $K$ of all rectifiable curves $C$ joining two fixed points $(x_0, y_0)$
and $(X, Y)$ and having $\mathcal{G}^j(C) = \gamma_j$, where the $\gamma_j$ are fixed numbers, is a non-empty
class.

THEOREM 2. Under hypotheses (3.1), (3.2), (3.3), (3.4), and (3.5), the class
$K$ contains a curve $C_0$ for which $\mathcal{F}(C)$ is least.

Let $\{C_n\}$ be a sequence of curves of $K$ for which $\mathcal{F}(C_n)$ tends to the greatest
lower bound $\mu$ of $\mathcal{F}$ on the class $K$. Then the numbers $a_0 \mathcal{F}(C_n) + a_\alpha \mathcal{G}^\alpha(C_n)$
are bounded; so by (3.3) the lengths $L_n$ of the $C_n$ are bounded. Consequently,
all the $C_n$ lie in a circle $Q$. For $(x, y)$ in $Q$ the function $F(y, \cos\theta, \sin\theta)$ is
bounded below, say by $-\nu$; hence if $C_n$ is represented with arc length as parameter
by functions $x = \xi_n(s)$, $y = \eta_n(s)$, $(0 \le s \le L_n)$, we have

$\mathcal{F}(C_n) = \int_0^{L_n} F(\eta_n, \xi_n', \eta_n')\,ds \ge \int_0^{L_n} (-\nu)\,ds = -\nu L_n,$

and $\mathcal{F}(C_n)$ is bounded below. That is, the number $\mu$ is finite.

As we saw in §1, it is possible for each $n$ to construct a sequence of polygons
$\Pi_{n,p}$ joining $(x_0, y_0)$ to $(X, Y)$ and tending to $C_n$ such that $\mathcal{F}(\Pi_{n,p}) \to \mathcal{F}(C_n)$
and $\mathcal{G}^j(\Pi_{n,p}) \to \mathcal{G}^j(C_n)$. We choose for each $n$ one of these polygons (which we
rename $\Pi_n$) such that

$|\mathcal{F}(\Pi_n) - \mathcal{F}(C_n)| < 1/n, \quad |\mathcal{G}^j(\Pi_n) - \mathcal{G}^j(C_n)| < 1/n.$

Then $\mathcal{F}(\Pi_n) \to \mu$ and $\mathcal{G}^j(\Pi_n) \to \gamma_j$, $(j = 1, \dots, m)$. We may assume, if we wish,
that no side of $\Pi_n$ is parallel to the $y$-axis, since we can bring this about by an
arbitrarily small change, causing an arbitrarily small change in $\mathcal{F}(\Pi_n)$ and
$\mathcal{G}^j(\Pi_n)$.

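Suppose that $\Pi_n$ is defined in terms of arc-length by functions $x = \xi_n(s)$,
$y = \eta_n(s)$, $(0 \le s \le L_n)$. We define a new set of functions $x_n$, $y_n$ by the relations

$y_n(s) = \eta_n(s), \quad x_n(s) = x_0 + \int_0^s |\xi_n'(s)|\,ds.$

Then $y_n(L_n) = Y$ and $x_n(L_n) - x_0 \ge |X - x_0|$, and

$\int_0^{L_n} F(y_n, x_n', y_n')\,ds = \mathcal{F}(\Pi_n), \quad \int_0^{L_n} G^j(x_n', y_n')\,ds = \mathcal{G}^j(\Pi_n).$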

Exactly as in the preceding proof, we find that the curve $x = x_n(s)$, $y = y_n(s)$
can be considered to be composed of at most $k$ arcs, each one of which is
either concave or convex; the $j$th arc consists of segments along which
$(x_n', y_n') = (\cos\theta, \sin\theta)$ with $\theta$ in the interval $I_j$. Along such an arc $y_n'$ is
either monotonic increasing or monotonic decreasing. Likewise $x_n'$ is monotonic
unless $y_n'$ changes sign; so each of the (at most $k$) convex or concave
arcs can be split into at most two arcs on each of which both $x_n'$ and $y_n'$ are
monotonic.

Let us change parameter from $s$ to $t = s/L_n$. The polygon $x = x_n(s)$,
$y = y_n(s)$, $(0 \le s \le L_n)$, is then represented in the form $x = X_n(t)$, $y = Y_n(t)$,
$(0 \le t \le 1)$; and the interval $[0, 1]$ can be split into at most $2k$ subintervals
on each of which $X_n'(t)$ and $Y_n'(t)$ are monotonic. That is, $X_n(t)$ is concave or
convex as a function of $t$ on each separate subinterval, and likewise $Y_n(t)$.
The functions $x_n(s)$ and $y_n(s)$ satisfy a Lipschitz condition of constant 1; so
$X_n(t)$ and $Y_n(t)$ satisfy a Lipschitz condition of constant $L_n$, which is bounded.
Hence we can select a subsequence (we suppose it the whole sequence) for
which $X_n(t)$ and $Y_n(t)$ converge uniformly to limit functions $X_0(t)$ and $Y_0(t)$,
respectively. As in the preceding proof, the interval $0 \le t \le 1$ can be split into
at most $2k$ subintervals on each of which $X_0(t)$ is concave or convex, and
likewise $Y_0(t)$; and

$X_n'(t) \to X_0'(t), \quad Y_n'(t) \to Y_0'(t)$

for almost all $t$.
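Since the derivatives of the $X_n$ and $Y_n$ are bounded, we have at once

$\int_0^1 G^j(X_0', Y_0')\,dt = \lim_{n \to \infty} \int_0^1 G^j(X_n', Y_n')\,dt = \lim_{n \to \infty} \mathcal{G}^j(\Pi_n) = \gamma_j,$

$\int_0^1 F(Y_0, X_0', Y_0')\,dt = \lim_{n \to \infty} \int_0^1 F(Y_n, X_n', Y_n')\,dt = \lim_{n \to \infty} \mathcal{F}(\Pi_n) = \mu.$

The curve $x = X_0(t)$, $y = Y_0(t)$ still is not a solution of our problem; for
although we have $Y_n(1) = y_n(L_n) = Y$ for all $n$, so that $Y_0(1) = Y$, still we have
only $X_n(1) - x_0 = x_n(L_n) - x_0 \ge |X - x_0|$, so that $X_0(1) - x_0 \ge |X - x_0|$. However,
let us define $t_0$ to be that value of $t$ for which

$x_0 + \int_0^{t_0} X_0'(t)\,dt - \int_{t_0}^1 X_0'(t)\,dt = X.$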


Such a $t_0$ exists, for the function

$x_0 + \int_0^t X_0'\,dt - \int_t^1 X_0'\,dt$

is a continuous function of $t$, and as $t$ goes from 0 to 1, it goes from
$x_0 - |X_0(1) - x_0|$ to $x_0 + |X_0(1) - x_0|$, and the number $X$ is not less than
the first of these and not greater than the second. If we now define

$y_0(t) = Y_0(t),$
$x_0(t) = x_0 + \int_0^t \epsilon(t) X_0'(t)\,dt,$

where $\epsilon(t) = 1$ for $0 \le t \le t_0$ and $\epsilon(t) = -1$ for $t_0 < t \le 1$, then $x_0(0) = x_0$, $x_0(1) = X$,
$y_0(0) = y_0$, $y_0(1) = Y$; and since $x_0'(t) = \pm X_0'(t)$,

$\int_0^1 F(y_0, x_0', y_0')\,dt = \int_0^1 F(Y_0, X_0', Y_0')\,dt = \mu,$

$\int_0^1 G^j(x_0', y_0')\,dt = \int_0^1 G^j(X_0', Y_0')\,dt = \gamma_j, \quad j = 1, \dots, m.$

Thus the curve $x = x_0(t)$, $y = y_0(t)$ is in the class $K$ and minimizes the integral
$\mathcal{F}$.
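The choice of $t_0$ in this last step is an intermediate-value argument, and it can be carried out by bisection. The following sketch is an illustration under assumptions not found in the paper: the derivative is taken to be $X_0'(t) = 2t + 1$, so that its integral over $[0, t]$ is $A(t) = t^2 + t$, the end points are $x_0 = 0$ and $X = 1$, and the function name is hypothetical.

```python
import math

def find_reflection_time(A, x0, X, tol=1e-12):
    """Find t0 in [0, 1] with x0 + A(t0) - (A(1) - A(t0)) = X by bisection.

    A(t) is the integral of a non-negative X_0' over [0, t], so the
    function h(t) = x0 + 2 A(t) - A(1) is continuous and non-decreasing.
    """
    h = lambda t: x0 + 2.0 * A(t) - A(1.0)
    lo, hi = 0.0, 1.0
    if not (h(lo) <= X <= h(hi)):
        raise ValueError("X is outside the range reached by h on [0, 1]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < X:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed data: X_0'(t) = 2 t + 1, hence A(t) = t^2 + t and A(1) = 2.
A = lambda t: t * t + t
t0 = find_reflection_time(A, x0=0.0, X=1.0)
```

Here $h(t) = 2(t^2 + t) - 2$ runs from $-2$ to $2$, so the target $X = 1$ is attained, and the root is $t_0 = (\sqrt{7} - 1)/2$. Reversing the sign of $X_0'$ after $t_0$ then brings the end point of the curve to $X$ without changing integrands that depend only on $|x'|$.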
For an example let us take 


F = o(y) 4. G! = (ax’? + by’?) 1/2, G? = (x4 
where $a$ and $b$ are positive and $\varphi'(y) > 0$ for all $y$. Conditions (3.1), (3.2), and
(3.3) are easily seen to be satisfied. We readily calculate

$\Omega_F(x, y, \cos\theta_1, \sin\theta_1, \cos\theta_2, \sin\theta_2) = \varphi'(y)(\sin\theta_1 - \sin\theta_2),$

which is positive if $-\pi/2 < \theta_2 < \theta_1 < \pi/2$. Hence (3.4) holds with $k = 1$ and
$\sigma_1 = +1$. By Theorem 2, if $(x_0, y_0)$ and $(X, Y)$ are any two points and $\gamma_1$, $\gamma_2$ any
two numbers such that there are curves $C$ joining $(x_0, y_0)$ to $(X, Y)$ and having
$\mathcal{G}^j(C) = \gamma_j$, $(j = 1, 2)$, then there is a curve of that class for which $\mathcal{F}(C)$ is
least.
CONTRIBUTIONS TO THE THEORY OF MULTIVARIATE
STATISTICAL ANALYSIS* 


BY 
WILLIAM G. MADOW†


The increasing application of statistical methods in the social sciences has 
resulted in the consideration of hypotheses more complicated than those for
which the theory of univariate statistical processes and the elementary theory 
of multivariate statistical processes suffice. One cause of the complications 
may be found in the fact that in the social sciences the results of an experi- 
ment are frequently vectors of several components rather than of one com- 
ponent. As examples we may note that the prices and quantities of a given 
commodity in p localities are vectors of p components, and that mental traits 
are generally tested by batteries of tests and not by a single test. Although 
the replacement of these vectors by some function, say an average, of their 
components is adequate for some purposes, yet in certain situations, any such 
function may be shown to be unsatisfactory. The fact that vector variates 
must then be analyzed requires the construction of a statistical theory which 
will facilitate that analysis. 

For the normal multivariate distribution, the only distribution for which 
results have been obtained, the beginnings of that theory may be found in 
the fairly intensive study, since Gauss, of least squares and multiple correla- 
tion, which culminated in the derivation by R. A. Fisher in 1928 of the distri- 
bution of the multiple correlation coefficient when the correlation parameter 
is not zero.‡ The general theory of vector variates has been developed since
1931. It has included the analysis of variance of a vector variate and the
theory of relations between sets of variables. Hotelling derived the distribution
of the generalization of Student's ratio§ in 1931. Distributions of statistics
occurring in the generalized analysis of variance were derived by Wilks||
in 1932. In 1936 Hotelling stated and solved a general problem in the theory
of relations between sets of variables.¶ We continue this analysis.


* Presented to the Society, March 26, 1937; received by the editors October 7, 1937. §2 was read 
before the Institute of Mathematical Statistics, December 27, 1936; §3 was read before the American 
Mathematical Society under the title, The generalized analysis of variance. 

† This research was done under a grant-in-aid from the Carnegie Corporation of New York.

‡ See [13]. The number in the brackets refers to the bibliography at the end of the paper.

§ See [18].

|| See [34]. Other contributions to the theory are given in [2], [3], [24], [36], [37], [38].

¶ See [20].
In §1 there are given certain definitions and notation which are used
throughout the paper. 

In §2 we prove several theorems,* the purpose of which is the establish- 
ment of a routine method of obtaining the distributions needed in multi- 
variate analysis. 

In §3 the theorems of §2 are employed to obtain some of the basic dis- 
tributions of the theory of vector variates. As illustrations of the uses of the 
methods, we examine the theory of generalized periodogram analysis,† derive
the joint distribution of $q^2$ and $z$,‡ and the joint distribution of the two canonical
correlations which exist when one of the sets contains two variables and
the other set contains at least two variables, under the assumption that not 
more than one canonical correlation parameter differs from zero. We also 
analyze the generalized covariance, the tetrad difference,|| and other statistics 
occurring in the theory of relation between sets of variables. 

It will be noted that the variables are rarely expressed as deviations from 
a mean or regression function. This is permissible since Theorem 7 enables us 
to assume that those functions have been eliminated. Thus, if $n'$ is the actual
sample number, and if the mean value of the chance variable is a linear function
of parameters, then $n$, the sample number in terms of which our formulas
are expressed, is equal to $n' - p$.

1. Definitions and notation. The following definitions and notation will
be used throughout this paper unless explicit statement to the contrary is
made.

If used as subscripts or superscripts, or as indices of summation or multiplication,
the letters $i$, $j$, $k$ will take on all integral values from 1 through $p$;
the letters $g$, $h$ will take on all integral values from 1 through $i - 1$, the letter $\nu$
will take on all integral values from 1 through $n$, and the letter $\gamma$ will take on
all integral values from 1 through $m$. We shall denote by $p_\gamma$ the sum
$n_1 + \cdots + n_\gamma$, $(p_0 = 0)$.

* These are more fully described in the first few paragraphs of §2.

† For a discussion of the Schuster periodogram in one variable, see [14] and [17].

‡ See [20], p. 333. These statistics are measures of the degree of dependence of sets of chance
variables.

|| M. A. Girshick has obtained some of these distributions under slightly less general conditions
by other methods. They have not, as yet, been published.

¶ That is, $A \times B$ is the set consisting of all pairs $(a, b)$ where $a$ is an element of $A$ and $b$ is an
element of $B$.


The notation $A \times B$ will be used for the combinatory product¶ of the two
sets $A$ and $B$. We shall write $A^2$ for $A \times A$. Thus, if we denote by $R^1$ the totality
of all real numbers, then $R^\nu$ is the totality of all sets of $\nu$ real numbers.
If $V$ is a subset of $A \times B$ and $b$ is an element of $B$, we shall denote by $V(b)$
the totality of elements $a$ of $A$, such that $(a, b)$ is an element of $V$.
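
As a concrete illustration of the combinatory product and of the section $V(b)$, the following Python sketch uses two small finite sets. The sets `A`, `B`, the subset `V`, and the helper name `section` are hypothetical choices made only for this example.

```python
from itertools import product

# Hypothetical finite sets, chosen only for illustration.
A = {1, 2, 3}
B = {"u", "v"}

# Combinatory product A x B: all pairs (a, b) with a in A and b in B.
AxB = set(product(A, B))

# A^2 stands for A x A.
A2 = set(product(A, A))


def section(V, b):
    """Return V(b): the elements a such that (a, b) belongs to V."""
    return {a for (a, bb) in V if bb == b}


# A subset V of A x B and two of its sections.
V = {(1, "u"), (3, "u"), (2, "v")}
print(len(AxB), len(A2))
print(sorted(section(V, "u")), sorted(section(V, "v")))
```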
If $x_1, x_2, \cdots, x_n$ are chance variables* with the distribution function†
$P(\xi_1, \xi_2, \cdots, \xi_n)$, and if

\[ P(\xi_1, \xi_2, \cdots, \xi_n) = \int_{-\infty}^{\xi_1} \cdots \int_{-\infty}^{\xi_n} f(x_1, \cdots, x_n)\,dx_1 \cdots dx_n, \tag{1.1} \]

where $f(x_1, \cdots, x_n)$ is a real single-valued non-negative summable function of
the $n$ real variables $x_1, \cdots, x_n$ defined on $R^n$, then $f(x_1, \cdots, x_n)$ is called the
probability density of the chance variables $x_1, x_2, \cdots, x_n$. If $n = p_m$ and


then the sets of chance variables $x_{p_{\gamma-1}+1}, \cdots, x_{p_\gamma}$ will be said to be independently
distributed. The symbol $D(x_1, \cdots, x_n)$ will stand for “the probability
density of the chance variables‡ $x_1, \cdots, x_n$.”


It is easy to see that if a set of chance variables has a probability density,
then any subset of them will have a probability density. Furthermore if
$x_1, \cdots, x_n$ have a probability density and if $n = p_m$, then a necessary and
sufficient condition that the sets of chance variables $x_{p_{\gamma-1}+1}, \cdots, x_{p_\gamma}$ be
independently distributed is

\[ D(x_1, \cdots, x_n) = \prod_\gamma D(x_{p_{\gamma-1}+1}, \cdots, x_{p_\gamma}). \]


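If $\phi(x_1, \cdots, x_n)$ is a measurable function§ of $x_1, \cdots, x_n$ on $R^n$, then the
symbol $E\{\phi(x_1, \cdots, x_n)\}$ will stand for “the mathematical expectation of
$\phi(x_1, \cdots, x_n)$,” and, by definition,

\[ E\{\phi(x_1, \cdots, x_n)\} = \int_{R^n} \phi(x_1, \cdots, x_n)\,dP(x_1, \cdots, x_n), \]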


* For definitions of chance variables, see [7], p. 8 and [21], p. 20. Unfortunately there are 
differences in the terminology used in papers in statistics and in probability. To avoid confusion, 
one must remember that variate, statistic, and chance variable are synonyms, as are probability 
density and distribution. Writers on statistical subjects sometimes use the term cumulated distribu- 
tion for distribution function. It is particularly to be noted that the term distribution function in the 
literature of the theory of probability is the integral of the distribution referred to in papers on the 
theory of statistics. 

† For all sets of real numbers $\xi_1, \cdots, \xi_n$, the probability that $x_1 < \xi_1, \cdots, x_n < \xi_n$ is
given by $P(\xi_1, \xi_2, \cdots, \xi_n)$. For a discussion of probability density and distribution function, see
[7] and [21].

‡ In using $D$ we, of course, assert that $x_1, \cdots, x_n$ are chance variables and have a probability
density.

§ We will usually extend the domain of definition of functions defined on a subset of $R^n$ to cover
the set $R^n$ by setting the functions equal to zero outside their original domain of definition.




1938] MULTIVARIATE STATISTICAL ANALYSIS 457 


where the integral on the right-hand side is a Lebesgue-Stieltjes integral.*
The positive definite quadratic forms $q$ and $q_\nu$ will be defined by the equations

\[ q = \sum_{i,j} \sigma^{ij} x_i x_j \]

and

\[ q_\nu = \sum_{i,j} \sigma^{ij} x_{i\nu} x_{j\nu}, \]

respectively. The matrix of $q$ and $q_\nu$ will be denoted by $(\sigma^{-1})$. The determinant
of $(\sigma^{-1})$ will be denoted by $\sigma^{-1}$. The inverse of $(\sigma^{-1})$ will be written $(\sigma)$, and
the elements of $(\sigma)$ will be denoted by $\sigma_{ij}$, where $\sigma_{ij}$ is the cofactor of $\sigma^{ij}$ in
$\sigma^{-1}$ divided by $\sigma^{-1}$. The determinant of $(\sigma)$ is, of course, $\sigma$.
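The function $N(x; c)$ will be defined by the equations

\[ N(x; c) = (2\pi c)^{-1/2} \exp[-x^2/2c], \qquad -\infty < x < \infty, \quad 0 < c < \infty, \]
\[ N(x; c) = 0 \quad \text{for all other values of } c. \]

The functions $N((x); (\sigma))$ and $N((x_\nu); (\sigma))$ will be defined by the equations

\[ N((x); (\sigma)) = (2\pi)^{-p/2} \sigma^{-1/2} \exp[-q/2], \]
\[ N((x_\nu); (\sigma)) = (2\pi)^{-p/2} \sigma^{-1/2} \exp[-q_\nu/2], \qquad -\infty < x_{i\nu} < \infty. \]

Of course the elements $\sigma_{ij}$ may assume any values as long as the requirement
that $q$ and $q_\nu$ are positive definite is satisfied.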

If $D(x) = N(x; \sigma^2)$, then $x$ will be said to be normally distributed with variance
$\sigma^2$ or to have a normal distribution with variance $\sigma^2$. If $D(x_1, \cdots, x_p)
= N((x); (\sigma))$, then $x_1, \cdots, x_p$ will be said to have a normal multivariate
distribution.
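
The relation between the quadratic form $q$, the matrix $(\sigma)$, and the normal multivariate density can be made concrete for $p = 2$. The sketch below assumes the usual normalizing constants, $(2\pi c)^{-1/2}$ in one variable and $(2\pi)^{-1}\sigma^{-1/2}$ in two; the function names `N1` and `N2` and the numerical values are hypothetical. When $\sigma_{12} = 0$ the bivariate density should factor into the two univariate ones.

```python
import math


def N1(x, c):
    """Assumed univariate normal density with zero mean and variance c."""
    return math.exp(-x * x / (2 * c)) / math.sqrt(2 * math.pi * c)


def N2(x, sigma):
    """Assumed bivariate normal density with zero means and covariance matrix sigma."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21          # the determinant written sigma in the text
    # Elements sigma^{ij} of the inverse matrix, i.e. the matrix of the form q.
    inv = ((s22 / det, -s12 / det), (-s21 / det, s11 / det))
    q = sum(inv[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))


point = (0.4, -1.1)
diagonal = ((2.0, 0.0), (0.0, 3.0))      # sigma_12 = 0: independent components
print(N2(point, diagonal), N1(point[0], 2.0) * N1(point[1], 3.0))
```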

The function G(x; n, c) will be defined by the equations 

0, c) = 1, om, 
G(x; m, c) = exp [— x/2c], 
0< xc am, 


and $G(x; n, c) = 0$ for all other values of $x$, $n$, $c$. The function $G(\lambda; n, c)$ will
be defined by the equation,

\[ G(\lambda; n, c) = \int_0^\infty x^\lambda\, G(x; n, c)\,dx, \qquad \lambda > -n + 1. \]


* If a measurable function $y = \phi(x_1, \cdots, x_n)$ has been defined, we shall mean by $y'$ the function
$\phi(x_1', \cdots, x_n')$.
Then it follows that

\[ G(\lambda; n, c) = (2c)^\lambda\, \frac{\Gamma((n + 2\lambda)/2)}{\Gamma(n/2)}. \]

If $D(x) = G(x; n, 1)$, then $x$ will be said to have a $\chi^2$ distribution with $n$ degrees
of freedom. In that case the distribution of $x/2$ will be said to be an
incomplete $\Gamma$-function distribution of index $n$.
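
The moment formula for $G(\lambda; n, c)$ can be checked numerically. The sketch below assumes that $G(x; n, c)$ is the density $x^{n/2-1} e^{-x/2c} / ((2c)^{n/2}\,\Gamma(n/2))$ on $0 < x < \infty$ (the $\chi^2$ density when $c = 1$), and compares a Simpson-rule integral of $x^\lambda G(x; n, c)$ with $(2c)^\lambda\,\Gamma(n/2 + \lambda)/\Gamma(n/2)$. The function names, the integration cutoff, and the step count are illustrative choices, not part of the paper.

```python
import math


def G_density(x, n, c):
    """Assumed form of G(x; n, c): a chi-square density rescaled by c."""
    if x <= 0.0:
        return 0.0
    norm = (2 * c) ** (n / 2) * math.gamma(n / 2)
    return x ** (n / 2 - 1) * math.exp(-x / (2 * c)) / norm


def G_moment_closed(lam, n, c):
    """Closed form (2c)^lam * Gamma(n/2 + lam) / Gamma(n/2)."""
    return (2 * c) ** lam * math.gamma(n / 2 + lam) / math.gamma(n / 2)


def G_moment_numeric(lam, n, c, upper=200.0, steps=20000):
    """Composite Simpson approximation of the integral of x^lam * G(x; n, c)."""
    h = upper / steps                     # steps must be even for Simpson's rule
    total = 0.0
    for k in range(steps + 1):
        x = k * h
        fx = (x ** lam) * G_density(x, n, c) if x > 0 else 0.0
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * fx
    return total * h / 3


n, c, lam = 4, 1.5, 2
print(G_moment_closed(lam, n, c), G_moment_numeric(lam, n, c))
```

For $\lambda = 1$ the closed form reduces to $nc$, the mean of the rescaled $\chi^2$ variable, which gives a quick sanity check on the assumed density.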
The function $D((x); (n))$ will be defined by the equations
I'(n/2) 
IT ( 1 ») 
+ 


G(A; n, c) = (2c)? 


and $D = 0$ otherwise. The function $D((\lambda); (n))$ is defined by the equation,


Dio); (m) = f 


0< zp 


P 
f ( Il D((x); (n))dx,-- + dx 
0< 2 


Then letting $\lambda = \lambda_1 + \cdots + \lambda_p$, we find that


Bia); = fy 

If $D(x_1, \cdots, x_p) = D((x); (n))$, then $x_1, \cdots, x_p$ will be said to have an
incomplete Dirichlet distribution of indices $n_1, \cdots, n_{p+1}$. If $p = 1$, then $x_1$ and
$x_2$ will be said to have an incomplete B-function distribution of indices $n_1$
and $n_2$.

2. General theorems. One of the most important theorems of univariate 
statistical analysis is that which states necessary and sufficient conditions 
that a set of quadratic forms in normally and independently distributed 
chance variables be themselves independently distributed with a joint probability
density which is a product of $\chi^2$ distributions.*

The fundamental part of this theorem may be stated as follows: 


THEOREM 1. (Cochran's theorem.) If $x_1, x_2, \cdots, x_n$ are normally and independently
distributed, with zero means and unit variances, if $q_1, q_2, \cdots, q_m$
are quadratic forms in $x_1, x_2, \cdots, x_n$, with constant coefficients such that

\[ \sum_\gamma q_\gamma = \sum_\nu x_\nu^2, \tag{2.1} \]

and if the rank of $q_\gamma$ is $n_\gamma$, then a necessary and sufficient condition that there
exist $n$ linear functions with constant coefficients,

\[ z_\mu = \sum_\nu c_{\mu\nu} x_\nu, \tag{2.2} \]

which are normally and independently distributed with zero means and unit variances,
and are such that

\[ q_\gamma = \sum_{\nu = p_{\gamma-1}+1}^{p_\gamma} z_\nu^2, \tag{2.3} \]

is*

\[ p_m = n. \tag{2.4} \]


* The theorem is stated by W. G. Cochran [6], p. 178, and essentially stated by R. A. Fisher
[12], p. 97.


The necessity of the condition (2.4) is obvious. For a detailed proof of
the sufficiency of the condition, the reader is referred to [6], p. 178. It may
be remarked, however, that the proof depends on the fact that in reducing
the $q_\gamma$'s to sums of squares, the $n$ linear functions thus obtained must be
linearly independent or contradict (2.1).
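
Condition (2.4) can be seen at work in the most familiar case: take $q_1 = n\bar{x}^2$ and $q_2 = \sum_\nu (x_\nu - \bar{x})^2$, so that $q_1 + q_2 = \sum_\nu x_\nu^2$. The sketch below builds the matrices of these two forms for $n = 4$ and confirms, with exact rational arithmetic, that their ranks add up to $n$. The example and the helper `rank` are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


n = 4
# Matrix of q_1 = n * (mean)^2: every entry equals 1/n.
Q1 = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]
# Matrix of q_2 = sum of squared deviations from the mean: identity minus Q1.
Q2 = [[(1 if i == j else 0) - Q1[i][j] for j in range(n)] for i in range(n)]

n1, n2 = rank(Q1), rank(Q2)
print(n1, n2, n1 + n2 == n)
```

Here the ranks are 1 and $n - 1$, which is the rank bookkeeping behind the independence of the sample mean and the sample variance.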

In this section we prove several theorems by means of which many prob- 
lems in multivariate statistical analysis may be solved. Of these, Theorems 7, 
8, and 9 are generalizations of the Fisher-Cochran theorem. 

A problem of statistical analysis may be formulated in the following way: 
If the distribution of several functions of certain chance variables has been 
derived, then what is the distribution of the same functions of other chance 
variables? 

As an example of this problem we may cite the extension of the distribution
of the multiple correlation coefficient of variables having a normal multivariate
distribution with zero multiple correlation parameter† to the distribution
of the multiple correlation coefficient of variables having a normal
multivariate distribution, the multiple correlation parameter of which need
not vanish.‡

In Theorem 2 the problem is solved under fairly general conditions. In 
particular the problem is solved when the functions are sufficient statistics§ 
for both sets of chance variables. Various properties of sufficient statistics 
are then studied in Theorems 3 and 4 in order to determine to what extent 


* The algebraic content of this theorem may be stated as follows: If the real quadratic forms
$q_1, \cdots, q_m$ in $x_1, \cdots, x_n$ are such that $\sum_\gamma q_\gamma = \sum_\nu x_\nu^2$, and if the rank of $q_\gamma$ is $n_\gamma$, then a necessary
and sufficient condition that there exist an orthogonal transformation $z_\mu = \sum_\nu c_{\mu\nu} x_\nu$, $(\mu = 1, \cdots, n)$, such
that $q_\gamma = \sum_{\nu = p_{\gamma-1}+1}^{p_\gamma} z_\nu^2$, is $p_m = n$.

† This distribution was derived by R. A. Fisher. See [10], p. 91 and [11], p. 811.

‡ This distribution was derived by Fisher four years after the appearance of the preceding
distribution. See [13], p. 660.

§ See [28], §4, for a general definition of a system of sufficient statistics. 
the possession of a set of sufficient statistics defines the underlying chance
variables and to derive an expression for the probability density of any system
of sufficient statistics.

In Theorem 5 we show that certain useful transformations may be treated 
as though they were the results of iterating linear transformations. 

Frequently, in statistical investigations one is confronted with the neces- 
sity of finding distributions subject to the condition that certain variables are 
held constant or the condition that the effect of certain variables be previ- 
ously eliminated. In such cases one must learn if the distributions obtained 
are functions of the values of the variables held constant and of the distribu- 
tion of these variables. In Theorem 6 we shall provide a means of obtaining 
this information in situations which essentially depend on the normal 
distribution. 

The following well known lemma will be used in the proofs of Theorems 2 
and 4: 


LEMMA 1. Let the real single-valued Borel-measurable* functions

\[ y_i = g_i(x_1, \cdots, x_n) \tag{2.5} \]

be defined on $R^n$ and let $\psi(y_1, \cdots, y_p)$ be a single-valued measurable function of
$y_1, \cdots, y_p$ defined on $R^p$. Upon substituting from (2.5), it is seen that
$\psi(y_1, \cdots, y_p)$ is a single-valued measurable function, $\phi(x_1, \cdots, x_n)$, of $x_1, \cdots, x_n$.
Then, if $x_1, \cdots, x_n$ are chance variables, it follows that $y_1, \cdots, y_p$, where
$y_i = g_i(x_1, \cdots, x_n)$, are chance variables and

\[ E\{\psi(y_1, \cdots, y_p)\} = E\{\phi(x_1, \cdots, x_n)\}. \]

A proof of this Lemma is given in [21], p. 35. A similar result is true when 
the functions (2.5) are measurable. It is noted that if 


“1, 


where $\mu_j$ is a real single-valued measurable function of $y_1, \cdots, y_p$, then we
have the following equation, which is often used in applying the Fourier 
transform theory in the derivation of probability densities, 


* For a definition and discussion of Borel-measurable functions, see, for example, [31], chaps. 
where $\lambda_j(x_1, \cdots, x_n)$ is the real single-valued measurable function of
$x_1, \cdots, x_n$ obtained by substituting from (2.5) in $\mu_j(y_1, \cdots, y_p)$.


THEOREM 2. Let 


and let 
\[ D(x_1', \cdots, x_n') = F(x_1', \cdots, x_n'), \]


where $F$ vanishes whenever $f$ vanishes. Let the real single-valued Borel-measurable
functions $y_1, \cdots, y_p$ be defined as in (2.5), and let $y_1', \cdots, y_p'$ be the same
functions of $x_1', \cdots, x_n'$. Suppose that, when
we substitute for $y_i$ and $y_i'$,


and 

(2.7) = K(yi,--+, 

Then if 

it follows that 

5 

where whenever k =0. 

Proof. For any Borel-measurable set A of R?, let 

5 ¥p) 
¥p) 


= 


if y{,---, yp is an element of the set A and k(y/,---, yp) #0, and let 
¥¢) =0 otherwise. Also let 


if x{,---, isan element of A~! and f(xj, - - - , x) 0, where is the 
set of values of x/,---, 2, such that* y/,---,y, is an element of A, and 
let =0 otherwise. 

Then, from Lemma 1, for all Borel-measurable sets A of R? 


* The fact that $y_1, \cdots, y_p$ are Borel-measurable functions of $x_1, \cdots, x_n$ causes $A^{-1}$ to be a
Borel set of $R^n$.
A R(ti, ty)


F(x1, +, %n)dx, d%q. 


Since the integral on the right is equal to the probability that y/, - - - ,y, is 
an element of A, the theorem follows from (1.1). 

As an example of the possible uses of Theorem 2, let us consider the der- 
ivation of the joint distribution of variances and covariances of chance varia- 
bles which have a normal multivariate distribution.* 

Hotelling† has, by a simple geometric argument, obtained the joint distribution
of sample correlation coefficients when all the “true” correlation
coefficients are zero. In this case, it is known that the variances are independent
of the correlation coefficients.‡ We may therefore state Hotelling’s result
more generally as follows:


Let D(xu, , Xpn) [],N (aie; 1). If 
jr 


2 
a= 


then 


i 


where 


‘lp * ** 1 


We are able, by means of Theorem 2, to extend Hotelling’s result to de- 
pendent chance variables. 

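Let $D(x_{11}', \cdots, x_{pn}') = \prod_\nu N((x_\nu'); (\sigma))$. If $a_i'^2$, $r_{ij}'$, and $w'$ are the same functions
of $x_{11}', \cdots, x_{pn}'$ that $a_i^2$, $r_{ij}$, and $w$ are of $x_{11}, \cdots, x_{pn}$, then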


1 
i 


ij 


(2.8) 


* This distribution was first found by J. Wishart [39], p. 38, and [4], p. 270. 

† See [19], p. 519.

‡ An analytic proof of this statement and Hotelling’s distribution is readily obtained by means
of Theorem 8.
This conclusion is immediate when one considers that in the notation of
Theorem 2, U(xu, --- , Xpn) , Xn) =1, 


and 


The transformation into variances and covariances is obvious. 

The chief reason for the introduction of the following theorems on suffi- 
cient statistics is the desire to learn to what extent the requirements that 
(2.6) and (2.7) are satisfied restrict the applicability of Theorem 2. We will 
show, in the corollary to Theorem 3, that at least for sufficient statistics, using 
the definition based on (2.9), the chief value of Theorem 2 is that it permits 
the extension of probability densities derived on the assumption that certain 
parameters have fixed values, to probability densities which are functions of 
the values of those parameters just as in the preceding example. 

Let the chance variable $x_\nu$ have a probability density $f(x_\nu, \theta_1, \cdots, \theta_q)$,
$(q \le n)$, which depends on several parameters, $\theta_1, \cdots, \theta_q$, and let the chance
variables $x_1, \cdots, x_n$ be independently distributed. Let the real single-valued
Borel-measurable functions $y_1, \cdots, y_p$ be defined as in (2.5). If

\[ \prod_\nu f(x_\nu, \theta_1, \cdots, \theta_q) = k(y_1, \cdots, y_p, \theta_1, \cdots, \theta_q) \cdot U(x_1, \cdots, x_n), \tag{2.9} \]

where $k(y_1, \cdots, y_p, \theta_1, \cdots, \theta_q)$ and $U(x_1, \cdots, x_n)$ are real single-valued
measurable functions of their indicated variables, then $y_1, \cdots, y_p$ are called
a sufficient system of statistics* with respect to the estimation of $\theta_1, \cdots, \theta_q$.
Now let the chance variable $x_\nu'$ have a probability density $F(x_\nu', \lambda_1, \cdots, \lambda_s)$
which depends on several parameters $\lambda_1, \cdots, \lambda_s$, and let the chance variables
$x_1', \cdots, x_n'$ be independently distributed.
The question suggested by (2.6) and (2.7) is the following: If

\[ \prod_\nu F(x_\nu', \lambda_1, \cdots, \lambda_s) = K(y_1', \cdots, y_p', \lambda_1, \cdots, \lambda_s) \cdot U'(x_1', \cdots, x_n'), \]

where the Borel-measurable function $y_i'$ is the same function of $x_1', \cdots, x_n'$
that $y_i$ is of $x_1, \cdots, x_n$, and where $K$ and $U'$ are real single-valued measurable
functions of their indicated variables, then what is the relation between
$D(x_\nu)$ and $D(x_\nu')$?
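
A factorization of the kind required in (2.9) can be verified numerically in a simple case. The sketch below assumes a normal density with unknown mean $\theta$ and unit variance, for which $y = \sum_\nu x_\nu$ is a sufficient statistic; the particular factors `k` and `U`, and the sample values, are illustrative choices rather than anything taken from the paper.

```python
import math


def f(x, theta):
    """Normal density with mean theta and unit variance (the assumed model)."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)


def k(y, theta, n):
    """Factor that depends on the sample only through y = sum of the x's."""
    return math.exp(theta * y - 0.5 * n * theta ** 2)


def U(xs):
    """Factor that is free of the parameter theta."""
    n = len(xs)
    return math.exp(-0.5 * sum(x * x for x in xs)) / (2 * math.pi) ** (n / 2)


xs = [0.3, -1.2, 0.8, 2.1]   # hypothetical sample
y = sum(xs)                  # the sufficient statistic

for theta in (-1.0, 0.0, 0.7):
    lhs = math.prod(f(x, theta) for x in xs)
    rhs = k(y, theta, len(xs)) * U(xs)
    print(theta, math.isclose(lhs, rhs, rel_tol=1e-9))
```

Because `U` does not involve $\theta$, two samples with the same value of $y$ have a likelihood ratio that is the same for every $\theta$, which is the practical meaning of sufficiency.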

The answer to that question is contained, for the absolutely continuous 


* Sufficient statistics were first defined by R. A. Fisher; see [9], p. 319. Also see [26], where 
certain definitions of sufficient statistics are compared. 
statistics and probability densities, in the corollary to the following derivation
of the form which $D(x_\nu)$ must have in order that a system of sufficient
statistics exist.*


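THEOREM 3. Let $f(x_\nu, \theta_1, \cdots, \theta_q)$ be a non-vanishing continuous function
of $x_\nu, \theta_1, \cdots, \theta_q$ on an open subset† $V$ of‡ $R^1 \times R^q$, and let $f$ be an absolutely
continuous function of $x_\nu$ on $V(\theta_1^0, \cdots, \theta_q^0)$ for every set of fixed values, $\theta_1^0, \cdots, \theta_q^0$,
of $\theta_1, \cdots, \theta_q$ in $\Theta$. Set $D(x_\nu) = f(x_\nu, \theta_1, \cdots, \theta_q)$, and assume $x_1, \cdots, x_n$ to be
independently distributed. Let (2.9) be satisfied, it now being assumed that $y_i$
is an absolutely continuous function of $x_\nu$ on an open subset $B$ of $R^1$ containing
$A$, for fixed values of $x_1, \cdots, x_{\nu-1}, x_{\nu+1}, \cdots, x_n$. If we assume that the jacobian,
$J(\delta_1, \cdots, \delta_p)$, where

\[ J(\delta_1, \cdots, \delta_p) = \left| \frac{\partial y_i}{\partial x_{\delta_j}} \right|, \]

does not vanish on $(B - Z)^n$, where $Z$ is a closed set of measure zero, for any of
the $C^n_p$ possible selections $\delta_1, \cdots, \delta_p$ of $p$ of the positive integers $1, \cdots, n$,
then there exists an open subset $V'$ of $V$ such that the measure of $V - V'$ is zero,
and $J(\delta_1, \cdots, \delta_p)$ does not vanish on any of the sets $[V'(\theta_1, \cdots, \theta_q)]^n$ for all
possible sets of values of $\delta_1, \cdots, \delta_p$. Furthermore, at each point $x_\nu^0, \theta_1^0, \cdots, \theta_q^0$
of $V'$, there is a neighborhood $N = w \times r$, where the definitions of $w$ and $r$ are

\[ w: |x_\nu - x_\nu^0| < h, \qquad r: |\theta_j - \theta_j^0| < h_j, \quad j = 1, \cdots, q, \]

in which there exist real single-valued absolutely continuous non-constant functions
$\alpha_j(x_\nu)$ and $\gamma(x_\nu)$ of $x_\nu$ and real single-valued continuous functions
$\beta_j(\theta_1, \cdots, \theta_q)$ and $\beta(\theta_1, \cdots, \theta_q)$ of $\theta_1, \cdots, \theta_q$, such that in $N$

\[ f(x_\nu, \theta_1, \cdots, \theta_q) = \exp\Big[ \sum_j \alpha_j(x_\nu)\,\beta_j(\theta_1, \cdots, \theta_q) + \gamma(x_\nu) + \beta(\theta_1, \cdots, \theta_q) \Big]. \tag{2.10} \]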


Proof. It is easy to see that if we define $V'$ to be the intersection of $V$ and
$(B - Z) \times \Theta$, then $V'$ is open, and the measure of $V - V'$ is zero. Now let the

* The result (2.10) has been obtained by Koopman and Darmois. See [23], p. 402 and [8], 
p. 1265. I have not been able to obtain the corollary to Theorem 3 from Koopman’s proof, which 
depends on a definition of sufficient statistics that includes the definition based on (2.9). It may be 
remarked that Koopman needs only the continuity of the functions y; whereas absolute continuity 
is assumed in Theorem 3. The proofs of Theorems 3 and 4 are generalizations of proofs for p=q=1 
in (2.9) given by Doob in an unpublished seminar lecture at Columbia University. R. A. Fisher 
has derived Theorems 3 and 4 in cases which are specializations of those here treated. See [15]. 

† The totality of elements in the sets $V(\theta_1, \cdots, \theta_q)$ will be denoted by $A$.

‡ The values of $x_\nu$ are elements of $R^1$, and the values of $\theta_1, \cdots, \theta_q$ are elements of an open subset
$\Theta$ of $R^q$.

neighborhood $N$ be defined as in the hypotheses. Then the difference,


[log f(x,, i, » 9q) log f(x, 


is a real single-valued continuous function of the variables x, - - - ,x,, 
69, 0%, on w"Xr* and is absolutely continuous in x, for fixed 
values of the other variables. Hence the equation 


» 9q) log f(x», 61, 


is valid throughout w" Xr?. Letting 6;=j7+1 in J(é,, - - - , 6p), which involves 
no loss of generality, we see that the equation 


ax, 61, » 9a) log f(xy, 0, 67)], 

where the a,; is the cofactor of dy;/dx, in J(2,---, p+1) divided by 
J(2,---, p+1), is valid throughout w"Xr?. Upon assigning fixed values, 
+, Of,---, from Xr to ---, Xn, +, Of, we find 
that the equation 


where 


h( x1) f(%1, 67°, » O¢°), 


Oy; 
a(m)= >> 
7 1 


flog f(i+1, 41, 9a) log f(x i+1, 67°, ,64°)], 
0 
= 
is valid throughout wXr. Then (2.10) may be obtained by integration of 
(2.11). 


465 
flog 
(og 
OX, 
flog 
ay, R41 
(2.12) 
The absolute continuity and continuity of the functions may be easily
proved. Obviously, if (2.10) is valid for all possible values of $x_\nu$ and $\theta_1, \cdots, \theta_q$,
then the statistics, $\sum_\nu \alpha_j(x_\nu)$, are a sufficient system of statistics with respect
to the estimation of $\theta_1, \cdots, \theta_q$.
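
As an illustration of the exponential form in (2.10), assumed here to read $f = \exp[\sum_j \alpha_j(x)\beta_j(\theta) + \gamma(x) + \beta(\theta)]$, the normal density with mean $\mu$ and variance $v$ can be written with $\alpha_1(x) = x$, $\alpha_2(x) = x^2$ and $\gamma(x) = 0$. The sketch below checks this identity numerically; the parametrization and the function names are illustrative assumptions.

```python
import math


def normal_density(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def exponential_form(x, mu, var):
    """exp[ sum_j alpha_j(x) * beta_j(theta) + gamma(x) + beta(theta) ]."""
    alphas = (x, x * x)                        # alpha_1(x), alpha_2(x)
    betas = (mu / var, -1.0 / (2 * var))       # beta_1(theta), beta_2(theta)
    beta0 = -mu * mu / (2 * var) - 0.5 * math.log(2 * math.pi * var)
    gamma = 0.0                                # gamma(x) vanishes for this family
    return math.exp(sum(a * b for a, b in zip(alphas, betas)) + gamma + beta0)


for x in (-1.5, 0.0, 2.0):
    print(x, normal_density(x, 0.5, 2.0), exponential_form(x, 0.5, 2.0))
```

In this family the sums $\sum_\nu x_\nu$ and $\sum_\nu x_\nu^2$ play the role of the statistics $\sum_\nu \alpha_j(x_\nu)$.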

It is clear that a theorem similar to Theorem 3 may be proved for multi- 
variate distributions. 

The answer to the question of the relation of densities determining the 
same sufficient statistics is contained in the following corollary: 


COROLLARY. Let


D(x; ) = F(x;, » 


and let f and F not vanish except on a set of values of x of measure zero for all 
values of 0, and , considered. Assume that D(x,) and D(x; ) 
satisfy the conditions of Theorem 3, the statistics y{,---,y, being the same 
functions of xi ,-- + ,Xn thaty:,---,y,are of x,---,Xn. Thenin N and N’, 
when N' is a neighborhood of values of x’, , Xs Satisfying the conditions of 
Theorem 3, the x’ values of N’ being those of w, the equations (2.10) and 


F( xy, * Ae) 


exp | > (Au, Xs) + B’(A1, ds) + (x )| 


(2.13) 
i 

are valid; hence D(x,) and D(x; ) differ only in the functions y, B, B; and y', B’, 

8} . Furthermore if 


Xn) = K(x, Za); 


then y(x) =7'(x), any constant difference being eliminated by means of B and B’. 


The proof is omitted. It depends on (2.12). 
We now assume (2.9) to be valid throughout R!X<® and find the joint 
probability density of the sufficient statistics gi, - - - , @» where 


1 


Integrating f with respect to x, we note that 8((,---, 0,) is a function 
5(61, -- , By) of the values of - - - , -- ,Bp(01, - , defined by 
the equation 


exp = 04) Jax. 


THEOREM 4. Let $D(x_\nu)$ be given by (2.10) for all $x_\nu$ except on a set where
$D(x_\nu) = 0$. Then


ith ity 
n nN 


Proof. Since, by Lemma 1, 


{exp [ix = E {exp 


n 


it follows that 


E{exp 
ity ity 


Then the proof is completed by using the Fourier transform theorem.* 

One of the uses of Theorem 4 is to find the distribution of statistics which
are functions of sufficient statistics, and which may, in fact, be an equivalent
set of sufficient statistics.† It is easy to see that


E{ «;(x)} = 3g, » Bp), 


and that 
= Ef — Efaj(x)} ][ax(x) — Efax(x)}]} = By). 


The purpose of the following theorem is to make possible the simplification
of our proofs of Theorems 6, 7, and 8.


* See, for example, [5], p. 191.
† Two sets of sufficient statistics may be called equivalent if both satisfy equations of the
form (2.9).


THEOREM 5. Let the real single-valued Borel-measurable functions,

\[ y_i = y_i(x_1, \cdots, x_n), \qquad n > 1, \tag{2.14} \]

be defined on $R^n$. Let $y_{p_{\gamma-1}+1}, \cdots, y_{p_\gamma}$ depend only on $x_1, \cdots, x_{p_\gamma}$, and let
$y_{p_{\gamma-1}+1}, \cdots, y_{p_\gamma}$ be a linear transformation, having unit determinant, of
$x_{p_{\gamma-1}+1}, \cdots, x_{p_\gamma}$ for almost all sets of fixed values of $x_1, \cdots, x_{p_{\gamma-1}}$, where the
transformation may depend on the fixed values of $x_1, \cdots, x_{p_{\gamma-1}}$, and $n = p_m$. Then
if $A$ is a Borel-measurable set of values of $y_1, \cdots, y_n$, and $A^{-1}$ is the set of values
of $x_1, \cdots, x_n$ such that $y_1, \cdots, y_n$ is an element of $A$, it follows that $A^{-1}$ is
Borel-measurable and has the same measure as $A$.

Proof. The proof is given for $m = 2$. We shall use $p$ for $p_1$. It is well known
that if $A$ is Borel-measurable, then $A^{-1}$ is Borel-measurable. Let us denote
by $c_A(y_1, \cdots, y_n)$ the function which has value one if $y_1, \cdots, y_n$ is an element
of $A$ and which is zero otherwise. If we denote the measure of $A$ by
$\mu(A)$, it follows that $c_A(y_1, \cdots, y_n)$ is Borel-measurable if $A$ is Borel-measurable,
and

\[ \mu(A) = \int_{R^n} c_A(y_1, \cdots, y_n)\,dy_1 \cdots dy_n. \]

Now it may readily be shown that if $c_A(y_1, \cdots, y_n)$ is Borel-measurable
in $R^n$, and we assign fixed values, $y_1^0, \cdots, y_p^0$, to $y_1, \cdots, y_p$, then
$c_A(y_1^0, \cdots, y_p^0, y_{p+1}, \cdots, y_n)$ is Borel-measurable in $R^{n-p}$.* From Fubini's
theorem,† it follows that

\[ \mu(A) = \int_{R^p} \Big[ \int_{R^{n-p}} c_A(y_1, \cdots, y_n)\,dy_{p+1} \cdots dy_n \Big]\,dy_1 \cdots dy_p. \tag{2.15} \]

In a precisely similar fashion, it may be shown that

\[ \mu(A^{-1}) = \int_{R^p} \Big[ \int_{R^{n-p}} c_{A^{-1}}(x_1, \cdots, x_n)\,dx_{p+1} \cdots dx_n \Big]\,dx_1 \cdots dx_p. \]

Inasmuch as the linear transformation having unit determinant is measure
preserving, it follows that if our fixed values of $x_1, \cdots, x_p$ are determined
by means of the assigned fixed values of $y_1, \cdots, y_p$ and the linear transformation,
then, except for a set of measure zero,

\[ \int_{R^{n-p}} c_A(y_1, \cdots, y_n)\,dy_{p+1} \cdots dy_n = \int_{R^{n-p}} c_{A^{-1}}(x_1, \cdots, x_n)\,dx_{p+1} \cdots dx_n. \]


* See, for example, [31], p. 82.
† See [31], p. 77.


Since the transformation of $x_1, \cdots, x_p$ into $y_1, \cdots, y_p$ is measure preserving,
the proof is completed by transforming (2.15).
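
A local consequence of Theorem 5 is that the jacobian of such a transformation has determinant one even though the map as a whole is not linear. The sketch below checks this by central differences for a hypothetical example with $n = 3$, $p_1 = 1$, $p_2 = 3$: $y_1 = x_1$, and $(y_2, y_3)$ is obtained from $(x_2, x_3)$ by a rotation whose angle depends on $x_1$. The example and the function names are assumptions made for illustration only.

```python
import math


def transform(x1, x2, x3):
    """y1 = x1; (y2, y3) is (x2, x3) rotated through an angle that depends on x1."""
    c, s = math.cos(x1), math.sin(x1)
    return (x1, c * x2 - s * x3, s * x2 + c * x3)


def jacobian_det(point, h=1e-6):
    """Determinant of the 3x3 jacobian of `transform`, by central differences."""
    cols = []
    for k in range(3):
        plus = list(point)
        minus = list(point)
        plus[k] += h
        minus[k] -= h
        yp, ym = transform(*plus), transform(*minus)
        cols.append([(a - b) / (2 * h) for a, b in zip(yp, ym)])
    # cols[k][i] approximates d y_i / d x_k; transposition leaves the determinant unchanged.
    (a, b, c), (d, e, f), (g, i, j) = cols
    return a * (e * j - f * i) - b * (d * j - f * g) + c * (d * i - e * g)


print(jacobian_det((0.3, 1.0, -2.0)))
print(jacobian_det((2.5, -0.7, 0.4)))
```

The block-triangular structure is what makes this work: the first block is the identity, and the second block is a unit-determinant linear map for each fixed value of $x_1$.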


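COROLLARY. Let $\psi(y_1, \cdots, y_n)$ be a real single-valued measurable function of
$y_1, \cdots, y_n$ on $R^n$. Then $\psi(y_1, \cdots, y_n)$ is a real single-valued measurable function
$\phi(x_1, \cdots, x_n)$ of $x_1, \cdots, x_n$ on $R^n$ when we substitute from (2.14), and

\[ \int_A \psi(y_1, \cdots, y_n)\,dy_1 \cdots dy_n = \int_{A^{-1}} \phi(x_1, \cdots, x_n)\,dx_1 \cdots dx_n. \]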


Proof. If A has finite measure, we may use Lemma 1 to complete the 
proof, and if A is not of finite measure, we express A as a sum of non-over- 
lapping sets of finite measure and then proceed as above. 

More general results than Theorem 5 and its corollary may readily be obtained.
It is only necessary to require that $A$ and $A^{-1}$ be in 1-1 correspondence,
to replace the words “a linear transformation having unit determinant”
by the words “absolutely continuous functions with non-vanishing jacobian,” 
and to replace the words “and has the same measure as A” by the words 
“and has measure 


Aa 


J,= = py. 
Ox; 


Similar theorems may be proved when R” has a completely additive measure 
function defined on it,* and the functions (2.14) are measurable with respect 
to that measure function. 


The theorems stated next are of constant service in the analysis of the 
normal multivariate distribution. 


THEOREM 6. Let} p=pit + let 
po = 0; py = + dr, 
and assume that a positive integer, yo<m, exists such that for y>7o 


* See, for example, [31], chap. 1. 
† In this theorem we do not define $p_\gamma$ to be $n_1 + \cdots + n_\gamma$.


| 
where 


470 W. G. MADOW [November 
where it is understood that gy is a function of all the a;;, (i,7 =py~+1,--+, py) 
and a;;=)>_,Xi,xj,. Let the real single-valued Borel-measurable functions 
be defined on R”", and let the equations, 
= (y) ° 
Vu = Pur (xn, Sigs + 1, Pes 


define a non-singular linear transformation of xi, Xin into Ya, +++ , Vin for 
fixed values of xu, - ++, Xp,_.n and each value of the transformation being 
orthogonal if y 

Then, if 


= 
it follows that 
y=7otl 
where 
and it is understood that g, is a function of all the b;;, (i, 7=p,it1,--+, py). 


Proof. The functions $y_{i\nu}$ have been so defined that the corollary to Theorem 5
may be applied.
We come now to the generalizations of the Fisher-Cochran theorem. 


THEOREM 7. Let 
Xp) = N((x%); (o)), 
and let 


D(xu, Xpn) = [] D(x, , xp). 


Let, for each value of i, 7, and y the real single-valued function 


(2. 16) bijy = jv, 

be a bilinear form in x;, and x;, with constant coefficients, and let the matrix of 
b;;, be denoted by (d7). Suppose that 


Then, if $n_\gamma$ be the rank of $(d^\gamma)$, a necessary and sufficient condition that there
exist linear functions $z_{i\nu}$ of $x_{i1}, \cdots, x_{in}$ such that $D(z_{1\nu}, \cdots, z_{p\nu}) = N((z_\nu); (\sigma))$,
and

\[ b_{ij\gamma} = \sum_{\nu = p_{\gamma-1}+1}^{p_\gamma} z_{i\nu} z_{j\nu}, \]

is

\[ p_m = n. \tag{2.18} \]

Proof. The necessity of the condition (2.18) is obvious. In order to prove 
that (2.18) is sufficient, let $i = j = 1$. Then by Theorem 1, there exist $n$ linear
functions (2.2) which have the desired properties. Consider the system of 
linear functions 


the coefficients of which are those of the linear functions (2.2). The functions
(2.19) are an orthogonal transformation of the chance variables $x_{i\nu}$ for fixed $i$
and have the desired properties.


COROLLARY 1. If
(2.20) y=l,-->, kk Sm, 
and if n= px, then the chance variables 
(2.21) Zins 


have density 


Il Il [T((ny —i+ 


y=1 i=1 
k 1 k n 
{ I] | exp | - 2, + |, 
y=1 2 i,j y=1 v= 
where | bij;y| is a p rowed determinant for each value of y. 
Proof. The condition (2.20) permits the use of Wishart’s distribution.* 


COROLLARY 2. Let $D(x_{11}, \cdots, x_{pn}) = \prod_\nu N((x_\nu); (\sigma))$, and let the bilinear
forms (2.16) have the property (2.17). Then if the linear functions $z_{i\nu}$ exist for
$i = 1, \cdots, p$; $\nu = 1, \cdots, p_{m-1}$, and if


= LL NC); @)), 


* See equation (2.8) of this paper, or Wishart [39], p. 38, or Wishart and Bartlett [4], p. 270. 


it follows that


Pm-1; 


that the complete set of variables (2.19) exists, and that 
D(zu, Zpn) II N((2»); (c)). 


Proof. See [6], p. 181.* 
In Theorem 7 a direct generalization of the Fisher-Cochran theorem was 


considered. In Theorem 8, the generalization is one which permits the use of 
more general quadratic forms than could be examined by means of Theorem 
7. However, transformations of the chance variable (2.21) such as that of 
Wilks [34], p. 484, equation (30), will provide another derivation, by the 
moment method, of many of the distributions which we shall obtain, in §3, 
by means of Theorems 8 and 9. 


THEOREM 8. Let the $n_1 + \cdots + n_p$ chance variables $x_{i\nu}$, ($i = 1, \cdots, p$;
$\nu = 1, \cdots, n_i$), have probability density $\prod_{i=1}^{p} \prod_{\nu=1}^{n_i} N(x_{i\nu}; 1)$. Let the real single-valued
Borel-measurable function


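of $x_{11}, \cdots, x_{i n_i}$, defined on $R^{n_1 + \cdots + n_i}$, be, for each pair of values of $i$ and $j$,
($i = 1, \cdots, p$; $j = 1, \cdots, m_i$), and for almost all sets of fixed values of
$x_{11}, \cdots, x_{i-1, n_{i-1}}$, a quadratic form in $x_{i1}, \cdots, x_{i n_i}$ of rank $n_{ij}$, and let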


(2.23) t=1,---,p. 


j=1 


Then a necessary and sufficient condition that 


P mi 
. 24) D(qu, = nis, 1) 


i=1 j=l 


j=1 

* It is not difficult to generalize Theorems II and III of Cochran ([16], pp. 179, 181) by following 
the same procedure that has sufficed for the generalization of Cochran’s Theorem II. It may be re- 
marked that a necessary and sufficient condition for the satisfaction of equation (3.1) of Wilks 
([37], p. 326) is provided by the generalization of Cochran’s Theorem III. The generalizations of 
Cochran’s Theorems I and IT are also very useful in the moment approach to the theory of multi- 
variate statistical analysis. 


is 
mi 
Proof. Necessity. The additive property of the $\chi^2$ distribution suffices; for
$\sum_\nu x_{i\nu}^2$ has the $\chi^2$ distribution with $n_i$ degrees of freedom, and from (2.22)
and (2.23), $\sum_j q_{ij}$ has the $\chi^2$ distribution with $n_{i1} + n_{i2} + \cdots + n_{i m_i}$ degrees
of freedom.

Sufficiency. We obtain the analogue of Theorem 1 for the hypotheses of 
Theorem 8. For each value of 7, assigning fixed values to xn, +--+ , ¥i-1,n;-1) 
we obtain from Theorem 1 »; real linear functions 


(2.26) Sn = 


which are such that the corresponding chance variables have density 


Il N (Ziv; 1); 


v=] 


and for the fixed values of , 
2 ° 
Dd Ziv = gis, 


where v runs from --- +;,;.1.+1 through --- 

The coefficients ci,, of the linear forms (2.26) are real single-valued 
Borel-measurable functions of the coefficients of g;; for fixed values of 
Let c,(w) be the same function of the functions 


ij 1 


that cj, is of the coefficients of the quadratic form having constant coeffi- 
cients. Since Borel-measurable functions of Borel-measurable functions are 
Borel-measurable, it follows that the functions


ni i 
= Cur(@) Xin 


are Borel-measurable functions of xu, --- , %in;. For almost all sets of fixed 
values of xn, , ¥i-1,n;_,, it is clear that yu, - - , Yin, is a linear orthogonal 
transformation of xj, - - - , Xin, Hence the hypotheses of Theorem 5 are satis- 
fied. Furthermore, the functions y;, have been so defined that 


2 


where v runs from -- +2;,;-1+1 through m+ --- +,;. Then by the 
corollary to Theorem 5, it follows that 


a 
ny 
f 
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i=1 


and the proof is completed by using the additive property of the $\chi^2$ distribution.
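The rank condition can be illustrated numerically in the simplest classical case. The sketch below is not part of the argument above; it assumes the familiar univariate decomposition of $\sum_\nu x_\nu^2$ into $n\bar{x}^2$ and $\sum_\nu (x_\nu - \bar{x})^2$, and the function names are hypothetical. It checks that the two forms add up to the sum of squares and that their ranks add up to $n$.

```python
import numpy as np

def cochran_pieces(n):
    """Matrices of the forms n*xbar**2 and sum (x_v - xbar)**2 (illustrative choice)."""
    ones = np.ones((n, 1))
    a_mean = ones @ ones.T / n      # matrix of n * xbar**2, rank 1
    a_resid = np.eye(n) - a_mean    # matrix of sum (x_v - xbar)**2, rank n - 1
    return a_mean, a_resid

def rank_condition_holds(mats, n):
    """True if the forms add up to sum x_v**2 and their ranks add up to n."""
    total_is_identity = np.allclose(sum(mats), np.eye(n))
    ranks = [np.linalg.matrix_rank(a) for a in mats]
    return bool(total_is_identity and sum(ranks) == n)

n = 7
a_mean, a_resid = cochran_pieces(n)
print(rank_condition_holds([a_mean, a_resid], n))
# Under the rank condition the matrices are mutually orthogonal projections.
print(np.allclose(a_mean @ a_mean, a_mean), np.allclose(a_mean @ a_resid, 0))
```

A decomposition such as $\tfrac12\sum x_\nu^2 + \tfrac12\sum x_\nu^2$ adds up correctly but fails the rank condition, and the two halves are then not independent $\chi^2$ variables.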

THEOREM 9. Let $D(x_{11}, \cdots, x_{pn}) = \prod_\nu N((x_\nu); (\sigma))$. Let $n_1 = \cdots = n_p = n$
in Theorem 8, and let the functions (2.22) satisfy (2.23). If, for some i and j, 
40, let Ypm,) be a real-valued Borel-measurable function of its 
indicated variables such that 


Then a necessary and sufficient condition that 


v=1 
1 


exp | - 


D tom) | 


(2.28) =n. 
v=1 


Proof. Necessity. The necessity of the condition (2.28) may be demon- 
strated as in the proof of Theorem 8. 


Sufficiency. If, in Theorem 2, we let /=/’=1, 


1 
Qpm,) = exp E = D tm), 


i,j 


and 
1 
k(qu, Ipm,) = exp | - visqu, tom) |; 


then the functions (2.22) are such that (2.6) and (2.7) are true. Hence, the 
desired result, (2.27), follows from the straightforward application of Theo- 
rems 2 and 8. 

It is apparent that many distributions of functions of chance variables 
having a normal multivariate distribution may be derived by using Theorem 
9. Some of these will be considered in §3. 

3. Vector variates. In this section we apply the methods developed in §2 


(2.27) — 
1s 
to various problems in the generalized analysis of variance, relations between
sets of variables, and generalized periodogram analysis. 

In order to do this with a minimum of duplication, two theorems in the 
theory of quadratic forms are first stated. These theorems and their corol- 
laries are useful in the theory of univariate statistical analysis. In Theorems 
12 and 13, the previous theorems are used to obtain joint distributions of 
functions which are quadratic forms in certain variables for fixed values of 
other variables. These latter distributions are then transformed in order to 
derive distributions which are useful in the theory of vector variates. The 
notation and terminology have already been given in the first section of this 
paper. 

Let 


let c'=o;/o;1, and let oi*‘ be the cofactor of oj, in o; divided by ai, 
(j,k=1, cee 1), = Git, 
Set 


v 


and assume that $a_i$, $a^i$, and $a_i^{jk}$ are the same functions of the $a_{ij}$ that $\sigma_i$, $\sigma^i$,
and $\sigma_i^{jk}$ are of the $\sigma_{ij}$. We shall refer to any determinant $a_i$ as a generalized
sum of squares since


(A,)?, 


where 


X2ue 


(3.1) A, = 


Min, Ving * Ving 


and the summation is over all the distinct selections of $i$ integers $\mu_1, \mu_2, \cdots, \mu_i$
from $1, \cdots, n$.
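The representation of a generalized sum of squares as a sum of squared determinants is the Cauchy-Binet identity, and it may be checked numerically. In the sketch below the array `x` is assumed to hold the observations $x_{j\nu}$ with $i$ rows and $n$ columns, so that $a_{jk} = \sum_\nu x_{j\nu} x_{k\nu}$; the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def generalized_sum_of_squares(x):
    """Determinant of the i-rowed matrix a_jk = sum_v x_jv * x_kv."""
    return np.linalg.det(x @ x.T)

def sum_of_squared_minors(x):
    """Sum of squares of all i-rowed determinants formed from i columns of x."""
    i, n = x.shape
    return sum(np.linalg.det(x[:, list(cols)]) ** 2
               for cols in combinations(range(n), i))

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 6))   # i = 3 variates, n = 6 observations (arbitrary sizes)
print(np.isclose(generalized_sum_of_squares(x), sum_of_squared_minors(x)))
```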

We remark that if $\beta_{ig}$ is the regression coefficient of $x_g$ in the “true” regression
function of $x_i$ on $x_1, \cdots, x_{i-1}$, where $x_1, \cdots, x_p$, ($p \ge i$), are assumed
to have a normal multivariate distribution, then


O11 O12°** 
O21 
Mm 
» 
— 
= 
Similarly, if $b_{ig}$ is the regression coefficient of $x_g$ in the sample regression function
of $x_i$ on $x_1, \cdots, x_{i-1}$, then
qivt 
big 

In the following theorem it is shown that the coefficients of a transforma- 
tion by which a definite quadratic form may be reduced to a sum of squares 
are the regression coefficients. It is necessary to emphasize that although 
much of this material is known, yet no complete treatment seems to exist. 


THEOREM 10. There exist p linear forms 


(3.2) = — Biot, 


such that 


The proof is omitted. It is noted that the transformation (3.2) has unit 
determinant. 
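A numerical sketch of the kind of reduction described in Theorem 10 follows. It assumes that the forms (3.2) subtract from each $x_i$ its true regression on $x_1, \cdots, x_{i-1}$, which is how the surrounding text describes them; the triangular factorization used to obtain the coefficients, and the function name, are choices made here for illustration only.

```python
import numpy as np

def regression_transformation(sigma):
    """Unit lower-triangular t with t @ sigma @ t.T diagonal, and that diagonal."""
    chol = np.linalg.cholesky(sigma)   # sigma = chol @ chol.T, chol lower triangular
    diag = np.diag(chol)
    unit_lower = chol / diag           # divide column j by chol[j, j]: unit diagonal
    t = np.linalg.inv(unit_lower)      # row i holds 1 and the coefficients -beta_ig
    return t, diag ** 2

rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4))
sigma = m @ m.T + np.eye(4)            # an arbitrary positive definite matrix
t, d = regression_transformation(sigma)

print(np.allclose(t @ sigma @ t.T, np.diag(d)))   # reduced to a sum of squares
print(np.isclose(np.linalg.det(t), 1.0))          # unit determinant
# The last row reproduces the regression of x_4 on x_1, x_2, x_3.
beta = np.linalg.solve(sigma[:3, :3], sigma[:3, 3])
print(np.allclose(-t[3, :3], beta))
```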


COROLLARY 1. Let 


$D(x_{1\nu}, \cdots, x_{p\nu}) = N((x_\nu); (\sigma))$,
and let 


Then if $y_{i\nu} = x_{i\nu} - \sum_g \beta_{ig} x_{g\nu}$, it follows that


= N(y; 


= II II D(yir). 


The proof is omitted. 

In Theorem 10 and Corollary 1 it was shown that variables which have a 
joint normal distribution may be expressed in terms of variables which are 
distributed independently in normal distributions. In the following corolla- 
ries it will be shown that certain functions of the dependent variables are in- 
variant under the transformation into independent variables and hence have 
the distributions of the same functions of independent variables. 


g 
and 
COROLLARY 2. Let $a_i'$ be the generalized sum of squares with the elements
Then 
(3.3) {f= a. 


Proof. Equation (3.3) is obvious when $a_i$ and $a_i'$ are written as sums of
squares of determinants as in (3.1).


COROLLARY 3. If $a'^i = a_i'/a_{i-1}'$,* then, except for a set of measure zero,


(3.4) = ais >» Wigs 
where ’ 
(3.5) Wig = > P 


and the coefficients 


g-1 
jul 


are such that 
(3.7) DX = Son. 
Proof. It follows from the definition of a’* thatt 
(3.8) = ai — Giga 
gh 


From Theorem 1 and (3.8), it follows that 


2 
ais — = win, 


for almost all sets of fixed values of yu, - - - , Yi-1,n, Wheret 


Wig = (a, -> 


k=l 


(3.9) 
= 


* The functions a’i#é and bjg are to be defined as are a/* and b;, but with aj, replacing ajz, 
(j, k=1, ++ +, i). From Corollary 2 it follows that a’*=a’‘, 

T It is, of course, well known that two alternative expressions for a’* are a’i=ay—), otigdt, and 

t The function b;., is the coefficient of x, in the sample regression equation of xj on %, °° , XQ, 
and b’;., is the same function of the yy. 


| 
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Hence, equations (3.4), (3.5), and (3.6) are true. Equation (3.7) may be veri- 
fied by direct computation, or by employing the known properties of devia- 
tions from sample regression functions to avoid that computation. 


COROLLARY 4. The function $w_{ig}$ may be written as


i-1 
(3.10) Wig = be. Bik Bi). 


k=g+1 


Proof. From (3.9) it follows that 


Wig = 


Now it may be shown that if 
, 
Ug 


* Gg-1,9-1 

* Gy, g—1 


where $q$ and $r$ are positive integers such that $g < q \le i$; $g < r \le i$, then


r 
u=9 
where K,, is the same function of the a;; that Ki, is of the Oey. Then (3.10) 
is a consequence of the fact that 
K 
Dg k=g,---,t. 

Although Theorem 10 and Corollaries 1-4 are more than sufficient for the 
consideration of the distribution of the generalized sum of squares, they do 
not yield the distribution of the generalized correlation ratio and related sta- 
tistics. It is for this reason that the following definitions, theorem, and corol- 
laries are introduced. 

Let 


where v runs from m+ --- +,1+1 through m+ ---+m,, and let icy) 
and a{,), be defined as are a; and a‘ but with a;;, replacing a,;. Let eis Q4(y); 


Aig 
@g-1,1°* * Gg-1,¢9 
ai * dig 
Kor 
@g-1,1 
ar 
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and a‘) be similar functions of the y,,. Then there exist linear functions 
Wig) Such that Corollary 3 is true for each value of y. 
Let $s$ be a positive integer not greater than $p$, and let


Ci Ji,s+1 * Cip 


= ‘= 1, 


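Let $\tau_i^{jk}$ be the cofactor of $\sigma_{jk}$ in $\tau_i$ divided by $\tau_i$, ($j, k = i, s+1, \cdots, p$;
$i = 1, \cdots, s$). Let $\tau_0$ be the cofactor of $\sigma_{ii}$ in $\tau_i$, and let $\tau_0^{jk}$ be the cofactor of
$\sigma_{jk}$ in $\tau_0$ divided by $\tau_0$. Let $d$ be the same function of the $a_{ij}$ that $\tau_0$ is of the $\sigma_{ij}$.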


THEOREM 11. There exist p linear forms, for a preassigned value of v, 

(3.11) Ziv = Xi + 

= 
such that 

P 

i,j=1 1,j=s+1 

The proof is omitted. It is noted that the transformation (3.11) has unit
determinant. 

R. A. Fisher* has evaluated the difference between the sum of squares of
the deviations of $y_i$ from the sample regression function in $y_1, \cdots, y_{i-1}$, and
the sum of squares of the deviations of $y_i$ from the sample regression function
in $y_1, \cdots, y_k$, ($k < i-1$). In order to state his result in terms of the functions
which are generally computed in least squares solutions, we define the $(i-k-1)$-rowed
determinant


a’ =| jt=k+1,---,é-1, 
and let aj, be the cofactor of a’/*:*-! in a’ divided by a’. Then we may state 
Fisher’s result as the following corollary: 

COROLLARY 1. If $a'^{i(k)}$ is the sum of squares of the deviations of $y_i$ from its
sample regression function in $y_1, \cdots, y_k$, then the difference, $a'^{i(k)} - a'^i$, is, for
almost all sets of fixed values of $y_{11}, \cdots, y_{i-1,n}$, a positive definite quadratic form
in $y_{i1}, \cdots, y_{in}$ of rank $i-k-1$:


i-1 
bi 


tek+1 


* See [16], §29.1. 
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The proof is omitted. A simple proof may be given by means of Theorem 
11 and the Jacobi ratio theorem* in determinants. 

We now use Corollary 1 to determine the change in the rank of a‘ induced 
by the omission of several observations. 

Let us define, for each value of t, (¢=1,---, m—m), a variable which 
assumes only the values zero and one, 


Vitty = Oy 
Then aj) is the sum of squares of the deviations of y; from its sample regres- 
sion function in yi, , Vi-1, Vien—ny 

COROLLARY 2. The difference

— 
is, for fixed i and almost all sets of fixed values of Yu, - ++ , Yi-1.n, @ Quadratic 
form in Ya, Vin of rank n—1,.F 

The proof is omitted. 

The following corollary is useful in determining whether or not a set of 
regression coefficients differ significantly from one another. 

Let a/’ be the generalized sum of squares, the elements of which are 

COROLLARY 3. For any fixed value of $i$, and for almost all sets of fixed values
of $y_{11}, \cdots, y_{i-1,n}$, the function $v_i$ is a quadratic form in $y_{i1}, \cdots, y_{in}$ of rank
$(m-1)(i-1)$ and is equal to

(y) 
(3.12) XX — big (bin — Din 
oh 

The proof is omitted.§ 

Let n’=m+ -- + +m, and let u;=a’'—a’’‘, where /aj’ ,. Then 
we can state the following theorem: 


THEOREM 12. Let 
$D(x_{1\nu}, \cdots, x_{p\nu}) = N((x_\nu); (\sigma))$,
and let 


* Turnbull [32], p. 77. 

¢ The formula obtained by substituting in (3.11) does not provide a simple evaluation of the 
difference. 

t The functions a ) are the regression coefficients in the yth set of observations; in other words, 
) is defined as is bj, but with replacing a,;. 

§ Certain special cases of (3.12) have been used by Welch [33] and Kolodziejezyk [22]. 
Then if $n > n'$, the chance variables
(3.13) ay, Win Vir Us 
are independently distributed, and have joint probability density 
{ot n — n', o*)-G(vi; (m — — 1), o°) 
(3.14) 


If n=n’, the chance variables a(s), wi,, and v; are independently distributed 
and have a joint probability density which may be obtained from (3.14) by sup- 
pressing the terms involving u; and 2. 

Proof. The chance variables (3.13) have been shown, in the corollaries to 
Theorems 10 and 11, to satisfy the conditions of Theorem 8. The proof is then 
completed by means of Theorem 9.* 

The following corollaries are devoted to a consideration of the densities, 
moments and possible ranges of certain chance variables. Applications of the 
distributions are indicated. 


COROLLARY 1. Under the hypotheses of Theorem 12, it follows that
D(a‘) = G(ai;n — i+ 1, ¢%), 


and 


= IT DP(a'). 


Furthermore E {TJ ,(a‘)!?} =] n—i+1, and 0<ai<o. 


Proof. See Theorem 12 and Theorem 10, Corollary 2. 

Since $a_i = a^1 \cdots a^i$, there may be deduced from Corollary 1 the distribution
and moments of the generalized variance,† the ratio of the generalized
variance to any of its principal minors,‡ and the quotient of any two of its
principal minors. There may also be deduced the joint distribution and moments
of the generalized variance and some of its principal minors.

* This distribution is a generalization of that obtained by Bartlett [1], p. 268, starting from 
Wishart’s distribution. It may be remarked that the use of Theorem 6 in conjunction with Theorem 
12 yields a similar generalization of the results of Bartlett’s §2. 

t See Wilks [34], pp. 477, 481. 


t In the general case one has to transform D(a’, - - - , a) to find the probability that a ratio of 
products of the variables a‘ exceeds a given quantity. 


482 W. G. MADOW [November 


In the theory of relations between two sets of variables,* there occurs a 
statistic which measures the portion of the generalized variance of the first 
set of variables which is due to the variables of the second set. 

Let ¢;,=%;,—2;,, and assume &;, to be the same function of the a;; and xj, 
that ¢;, is of the o;; and x;,. Furthermore, let 


= Do (tw — (Xp — Fy), 


sin = (Ei Sin) (Ein Si), 
and let 
= 


(3.16) 


It is well known that gi;=gin+gi and a;;=di,+4;;2. Obviously 
= 


We shall define the residual variance in terms of what may be called the 
generalized covariances. Set s<p—s, and lett 


Wie = (— 


Then we note that from Theorem 11 and Theorem 8, the statistic 


* * 


°° 


is the generalized variance of p—s observations of a vector variate which has 
s—j+1 components and hence has the distribution and moments of such a 


* See Hotelling [20]. A knowledge of §§1—4 of that paper is assumed. 
+ By Wj, we shall mean the same function of the a;; that Wj, is of the gs;. It is noted that W1, is 
the numerator of the statistic g? defined by Hotelling [20], p. 333. 


’ W ie 
By; = —= ‘re 
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variance.* From Theorems 11 and 12 it follows that the parametric function 
estimated by is 
Assume k >7. Since 
Wi Bir 
= 
ie Bi 
its distribution may readily be found in accordance with the preceding dis- 
cussion of the distribution of the ratio of a generalized variance to one of its 
principal minors. 
Since Bj, and d are independent chance variables, the chance variable 


W;, is a product of independent variances and has the distribution and 
moments of such a product.f 


COROLLARY 2. (a) If $n > n'$, the chance variables $a^i_{(\gamma)}$, $a^i$ have density
a) — aim; — + (j — 1)(m 1), 0’) 
(3.17) 
Furthermore 
] 


7 


(3.18) 


and 
(3.19) 0 < ay) <a’ — aim) — — = 0. 


(b) If n=n’, the chance variables 


j 
* The residual variance is defined to be Bj,= W:,/d, and has the distribution of Bi, if and only 
if gijo= (¢,7=1,---, s). 
+ If s=p—s=2, the statistic Wiz is the square of the covariance tetrad difference. Hence if the 
two sets are independent, the distribution and moments of the covariance tetrad difference have 
been derived. It is noted that the variables of either set may be intercorrelated. The correlation 


tetrad difference will be discussed after Corollary 3. This analysis generalizes the discussions of 
Wilks [35], and Wishart [40]. 


1 
Aly)» ye l,---,w-i, 




have a joint probability density, joint moments, and possible ranges which are of 
the same form as (3.17), (3.18), and (3.19) with the obvious differences caused by 
the omission of dim). 


Proof. The corollary is an immediate result of the theorem on the trans- 
formation of multiple integrals. 

If n>n’, the joint density and moments of the chance variables aj), a;, 
(j=1, - - - ,i), may readily be obtained by transforming (3.17). The jacobian 
of the transformation 


1 i 
= Wy) ** * Ay) 


a; 


-1 
II ag | 1 
and the possible ranges of the chance variables are 


j=1 \ @ j-1(7+1) 
(3.20) 

l/k k 


It is noted that if $n = n'$, one may find the joint density of the chance variables


Al(y), 


Aj(y), Aj; 2, 


in a similar fashion. The moments of the chance variables are most easily ob- 
tained by means of Lemma 1 and (3.18). 
The inequalities (3.20) are consequences of the fact that 


k 
< ancy) < — aim) — 
Jul 
and of the application of the procedure of finding the extrema of a function 
subject to several conditions. 
From these distributions may be obtained the joint distribution of 


* * » Bim), 


by integrating the other variables over their possible ranges as stated in 
(3.20). 
Let $c^i_{(\gamma)} = a^i_{(\gamma)}/a^i$. Then we may state the following corollary:


m=a---ai 
y¥=1,---,#-1, 
COROLLARY 3. (a) If $n > n'$, then


D(ca, C(m)) 


i i i t 
* 5 C(m),@ )=D(a ) Dew, C(m)); 


(3.21) 


1 P 1 P i i i 
) = I[Dew,---: »C(m), @). 
i 


Furthermore 
io 


= [] , Ximj m—i+1, tm—i+1, n—n'+(m—1)(i—1)), 


i i i 
0 < < 1 — — 


(b) If n=n’, the chance variables 
1 
i 
Cy)» j=2,---, 
have a joint probability density, joint moments, and ranges which may be deter- 
mined as above. 


We shall call any statistic of the form 


$$c_{i(\gamma)} = \frac{a_{i(\gamma)}}{a_i}$$

a generalized correlation ratio. Then, since

$$c_{i(\gamma)} = c^1_{(\gamma)} \cdots c^i_{(\gamma)},$$


we may obtain the distribution and moments of the generalized correlation
ratio* and of the generalized Neyman-Pearson criteria.†
Applications of Corollary 3 occur in the theories of generalized periodo- 


* See Wilks [34], p. 482. 
† See Pearson and Wilks [29], and Wilks [34], p. 488. These criteria turn out to be functions
of products of generalized correlation ratios.


and 
and 
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gram analysis and relations between sets of variables. Of these, we first dis- 
cuss generalized periodogram analysis.* 
Let 


k 
Ziy = Xin — >, 


j=0 
and let the functions ¢o(#), - - - , dx(#) be such that 

= i,j =0,---,k. 
It is well known that if 


D(zv, » = (c)), D(zu, Z pn) Il D(z, 


and if 


Vin = Dy dultr) xiv, 


then 
— Qi, = N((y, — ay); (c)), = 0, k, 
and 
k 
u=0 


Furthermore there exist linear functions 


Yin = bultr) Xin, p= k+i,---,#—1, 
such that 
D(yu; = N((yu); (c)), 


n—1 


p=k+1 


and†
Dlyio — a0, * = D(yio — ano, * — * 


We consider our sample to contain an odd number‡ $n$, ($n = 2n' + 1$), of observations,
and let


* See Fisher [14], and Greenstein [17], for discussions of the Schuster periodogram in one 
variable. 

† This follows from Corollary 2 of Theorem 7.

‡ The adjustment to be made if $n$ is even will be obvious. See Fisher [14], p. 60.
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do(t,) = y-1/2, 


2 1/2 
$2;(t,) (=) si 
2 1/2 
$2;-1(t) = (=) 
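A numerical sketch of an orthonormal trigonometric system of this kind is given below. It assumes the usual choice for $n = 2n' + 1$ equally spaced observations, namely $\phi_0 = n^{-1/2}$, $\phi_{2j-1}(t_\nu) = (2/n)^{1/2} \cos(2\pi j \nu/n)$, and $\phi_{2j}(t_\nu) = (2/n)^{1/2} \sin(2\pi j \nu/n)$ with $t_\nu = \nu$; the arguments of the sine and cosine, the placement of the cosine terms, and the function name are assumptions made here.

```python
import numpy as np

def trig_system(n_prime):
    """Rows phi_0, ..., phi_{n-1} evaluated at nu = 1, ..., n, where n = 2*n_prime + 1."""
    n = 2 * n_prime + 1
    nu = np.arange(1, n + 1)
    phi = np.empty((n, n))
    phi[0] = n ** -0.5
    for j in range(1, n_prime + 1):
        phi[2 * j - 1] = np.sqrt(2 / n) * np.cos(2 * np.pi * j * nu / n)  # assumed form
        phi[2 * j] = np.sqrt(2 / n) * np.sin(2 * np.pi * j * nu / n)      # assumed form
    return phi

phi = trig_system(4)                              # n = 9 observations
print(np.allclose(phi @ phi.T, np.eye(9)))        # sum_v phi_i(t_v) phi_j(t_v) = delta_ij
```

Since the rows form an orthogonal matrix, the variables $y_{i\mu} = \sum_\nu \phi_\mu(t_\nu) x_{i\nu}$ preserve sums of squares and products of the observations.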


If $k < n - 1$, it may be shown that a Neyman-Pearson $\lambda$ criterion* for the
hypothesis 


aij = 0, 
is given by a generalized correlation ratio 


n—1 
iv 


n—1 
iv 


v=1 


However, if k=n—1, or if it is desired that the hypothesis be rejected if 
and only if at least one of the statistics 


n—1 
VivV iv — Vi,2y-1Vi,2y-1 — Vi,2yVi,27 


(3.22) = 


n—1 
Viv jv 


v=l 


differs significantly from one, some other approach to the problem of testing 
the hypothesis that no period exists must be given. 

Inasmuch as the joint distribution of statistics such as $c_{i(2\gamma)}$ is quite complicated,†
we shall extend Fisher’s distribution‡ only to the case $p = 2$, and
shall use, in doing this, not the statistics (3.22) but the statistics 


| + | 


n—1 
jv 


v=l 


(3.23) Cray = 


* See Neyman and Pearson [27], and Wilks [34], p. 485. 

t They may be found by means of the distributions of Corollary 3 and variables having a 
normal multivariate distribution. Some notion of the complications may be gained by examining the 
distribution for p=1. 

t R. A. Fisher [14], p. 57. The solution of a similar problem, if the variance and covariance 
parameters are known, is carried through in the same way. 
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n 
2avj 
n 
i,j=1,--:,p. 
| | 
= 
= 
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The joint probability density of a subset cay), Cay), * » Of the set 
(3.23), is, from (3.21), 


T'(n’)T(n’ — 1/2) 
— — (k + II 


II cers) 


62 (27%) €2(271) 


= 
C2 (245) n'—(k+1) /2—1 
(245) 


C1(2y1) C1 (27%) 


while the joint density of the entire set is 


€2(2) 


€2(2n’) 


1 — C12) — — 
C2(2n'—2) C2(2) (n’—1) /2-1 
) dcic2) dC1(2n’~2) 
€1(2n’—2) ¢1(2) 
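If we denote the probability that

$$c_{2(2\gamma_i)} > g_{\gamma_i}, \qquad i = 1, \cdots, k,$$

for each selection of $k$ integers $\gamma_1, \cdots, \gamma_k$ from $1, 2, \cdots, n'$, ($k = 1, 2, \cdots, n'$) by
$P(g_{\gamma_1}, \cdots, g_{\gamma_k})$, then the probability that at least $j$ of the events

$$c_{2(2\gamma)} > g_\gamma, \qquad \gamma = 1, \cdots, n',$$

occur is*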


n'—j 


(3.24) P? = (-1 Sis 


where 


and the summation is for all selections of $j + \nu$ integers from $1, \cdots, n'$.
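The probability that at least $j$ of a finite number of events occur can be computed from the sums $S_k$ of the probabilities of joint occurrence of $k$ events. The sketch below uses the standard formula $\sum_{\nu \ge 0} (-1)^\nu \binom{j+\nu-1}{\nu} S_{j+\nu}$, which we assume to be the content of (3.24); the toy events, weights, and function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def s_sum(events, weights, k):
    """S_k: sum over all selections of k events of the probability that all k occur."""
    total = Fraction(0)
    for selection in combinations(events, k):
        common = set.intersection(*selection)
        total += sum((weights[w] for w in common), Fraction(0))
    return total

def prob_at_least(events, weights, j):
    """Probability that at least j events occur, from the sums S_j, S_{j+1}, ..."""
    n = len(events)
    return sum((-1) ** v * comb(j + v - 1, v) * s_sum(events, weights, j + v)
               for v in range(n - j + 1))

def prob_at_least_direct(events, weights, j):
    """The same probability by direct enumeration of outcomes."""
    return sum((p for w, p in weights.items()
                if sum(w in e for e in events) >= j), Fraction(0))

weights = {w: Fraction(1, 6) for w in range(6)}   # six equally likely outcomes
events = [{0, 1, 2}, {1, 2, 3}, {2, 4}]           # three overlapping events
for j in (1, 2, 3):
    print(j, prob_at_least(events, weights, j) == prob_at_least_direct(events, weights, j))
```

With $j = 1$ the formula reduces to the ordinary inclusion-exclusion expression for the probability that the maximum exceeds its bound.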
From (3.24), we can find the probability that the maximum of the statistics
$c_{2(2\gamma)}$ exceeds a given quantity $g$, by letting $j = 1$ and $g_1 = \cdots = g_{n'} = g$.

It is, however, a question whether only the maximum need be considered or 


* See, for example, [25], p. 162.


whether each statistic should first be tested and then, if $l$ are significant, the
probability that the $l$ greatest of the statistics should exceed a given quantity
$g$ should be found.

We note that by using the corollaries to Theorem 13, other distributions 
which may be used to test hypotheses in generalized periodogram analysis 
may be obtained for p>2. However we shall not examine these here. 

We now derive the joint distribution of $q^2$ and $z$* and the joint distribution
of the two sample canonical correlations,† which exist if $s = 2$, when the
two sets are not completely independent.

The degree of independence which we admit is expressed by the assumption
that there exist $s$ linear functions $y_{1\nu}, \cdots, y_{s\nu}$ of $x_{1\nu}, \cdots, x_{s\nu}$, and $p - s$
linear functions $y_{s+1,\nu}, \cdots, y_{p\nu}$ of $x_{s+1,\nu}, \cdots, x_{p\nu}$, such that the probability
density of the chance variables
(3.25) 
is 
1 2 
2(1— p*) > i 
We first note that if p =0, the joint probability density of 
Wi, 
qd = = 
a.d 
and 
ap 4s(1) 
= 
a.d as 
wheref 
asi) = | 


may be obtained from Theorem 12, Corollary 3, since both $q^2$ and $z$ are invariant
under the transformation, which we have assumed, carrying the
chance variables $x_{i\nu}$ into the chance variables (3.25), ($\rho = 0$).

It follows from Theorem 9 and Corollary 3, that if we denote d12/au by 
R?, then the chance variables 


* Hotelling ([20], p. 333) defined $q^2$ and obtained on page 354 its distribution when one
non-vanishing canonical correlation parameter exists. The distribution of $z$ in that case may readily be
obtained by the same procedure as that by which we shall obtain the joint distribution of $q^2$ and $z$.

† Hotelling ([20], p. 375) derived this distribution assuming $s = p - s = 2$ and complete
independence of sets.

t a;j, and a,j are defined in (3.16). 


£ 
t 
| 
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2 i 
R , Ca), Cc, i=1,---,s, 
have density 


— j + 1)/2) 


I — s — + 1)/2)-T((n — p +s — + 1)/2)-T(G — 1)/2) 


j=2 
where $F(n/2, n/2, (p-s)/2, \rho^2 R^2)$ is the hypergeometric function.* If $s = 2$,
then = R? (1 and 
T'(m)(1 — p?)”/? 


D = (n—pt+1) /2—1( (p—3) /2—1 


(3.27) J F(n/2, n/2, (b — 2)/2, p*[a(1 — z — q*) + 


—1 
| —x)+ dx. 
i-s-¢ 
Then the general distribution may be similarly obtained as a multiple
integral. The joint moments of $q^2$ and $z$ are obtainable directly from (3.26).
Another statistic which may occasionally be of interest is 
2 
pea 
the distribution of which may be obtained as above. The factors of D are 
ratios of independent variances rather than correlation ratios if the sets are 
independently distributed. 
The joint distribution of the two sample canonical correlation coefficients 
may be evaluated by means of the relations 


$$q^2 = r_1^2 r_2^2,$$

$$z = (1 - r_1^2)(1 - r_2^2).$$

Since $\partial(q^2, z)/\partial(r_1^2, r_2^2) = r_1^2 - r_2^2$, if $r_1^2 \ge r_2^2$, it follows that the chance
variables $r_1^2$ and $r_2^2$ have a density which may be obtained by transforming
(3.27).
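The Jacobian of the transformation from the squared canonical correlations to $q^2$ and $z$ may be confirmed numerically. The sketch below takes the relations $q^2 = r_1^2 r_2^2$ and $z = (1 - r_1^2)(1 - r_2^2)$ as given and approximates the partial derivatives by central differences; the step size and the function names are arbitrary choices made here.

```python
def q2_and_z(r1sq, r2sq):
    """The pair (q^2, z) as functions of the squared canonical correlations."""
    return r1sq * r2sq, (1.0 - r1sq) * (1.0 - r2sq)

def jacobian(r1sq, r2sq, h=1e-4):
    """Determinant of d(q^2, z)/d(r1^2, r2^2) by central differences."""
    q_hi, z_hi = q2_and_z(r1sq + h, r2sq)
    q_lo, z_lo = q2_and_z(r1sq - h, r2sq)
    dq_d1, dz_d1 = (q_hi - q_lo) / (2 * h), (z_hi - z_lo) / (2 * h)
    q_hi, z_hi = q2_and_z(r1sq, r2sq + h)
    q_lo, z_lo = q2_and_z(r1sq, r2sq - h)
    dq_d2, dz_d2 = (q_hi - q_lo) / (2 * h), (z_hi - z_lo) / (2 * h)
    return dq_d1 * dz_d2 - dq_d2 * dz_d1

r1sq, r2sq = 0.7, 0.2
print(abs(jacobian(r1sq, r2sq) - (r1sq - r2sq)) < 1e-9)
```

Since both functions are linear in each argument separately, the central differences are exact apart from rounding.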


* The details consist of letting ¢= noting that 
-(1—c/an)"?, using Corollary 3 and Theorem 9, and then employing the same procedure as that 
adopted by R. A. Fisher [13]. 

+ Hotelling [19], p. 335. 
The first measure of relations between two sets of variables, both of which
contained more than one variable, was the tetrad difference. However, as will
be shown, the probability density of the tetrad difference depends on the
“internal” correlation parameters even when the two sets are assumed to be
independent. It is this fact which makes it necessary that $q^2$ be used instead
of the tetrad difference unless the correlation parameters are known.


To discuss the statistic 


Wis 
(3.28) Ty = , 


°** App 


which is a generalization of the tetrad difference, we first note that if all the 
variables are independent, or if we consider the statistic 


Wie 
811 22° Spp 


Tie 


in either of these two cases, the statistic is a product of independent de- 
terminants of correlation coefficients and has the distribution and moments 
of such a product.* 

We now assume that s = p—s=2 and derive the distribution of the square 
of the correlation tetrad difference under the assumption that the two varia- 
bles of each set are correlated but that the two sets are independent. 

Let the chance variables $x_{i\nu}$, ($i = 1, 2, 3, 4$; $\nu = 1, \cdots, n$), have probability
density†


2 .—n/2 


-exp | - > — 2pieXivXe + 


2(1 — > 
1 2 2 
Now by (3.28), 
2 
11 de 


* If all the variables are independent, then the p-rowed determinant of correlation coefficients 
is the product of p—1 independent multiple correlation coefficients and has the corresponding dis- 
tribution and moments. Wilks, [34], p. 492, first derived this distribution by the moment method. 
Certain more general distributions, which do not depend on the vanishing of all the correlation 
parameters may easily be derived. 

The assumption o4;=1, (i=1, 4), involves no loss of generality since 7), is invariant 
under the transformation iy. 


Then $R^2$, $V_1$, and $V_2$ are independent, and


aI'((n — 3)/2)T((m — 2)/2)T((n — 1)/2) 
— F(n/2, n/2, 3/2, pi2R?) 
— n/2, 1/2, paV2)- 


D(R’, Vi, V2) = (1 — 


From this may be found at once the moments and distribution of T12. The 
distribution and moments of 7;, may also be found. 

Let 7_ and let b{,) be defined as are and ai,. Let 
bij, and so on, be similar functions of the variables y,,. Assume that b/ =0, 
and let 

= Bey — 

We shall assume that $n = n_1 + \cdots + n_m$. This involves no loss of generality,
for if $n > n_1 + \cdots + n_m$ we may extend the system $b_{ij\gamma}$ to include $b_{ij,m+1}$.

THEOREM 13. Let $D(x_{1\nu}, \cdots, x_{p\nu}) = N((x_\nu); (\sigma))$, and let $D(x_{11}, \cdots, x_{pn})$
$= \prod_\nu D(x_{1\nu}, \cdots, x_{p\nu})$. Then

= m — i+ 1, 
Duin) = My, 
Dwi.) = G(wis 1,0), 


D(u,w?) = Diu) | | pw, |. 


Proof. See Theorem 11. 


COROLLARY 1. 


(3.29) D(bay, Bem) =G(bay3 m — + GG) — 0’) 
y=2 


and 
1 Pp 1 7 
Dba, Bem) = [] D@a, , 
i 


Furthermore 


Let 
2 
a a 
ay ae22 
and 
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II II (bi) | 


(3.30) 


= (a. > (ms + —F 


7 


The proof is omitted. 

Let dj, =0,)/bim). The probability density and joint moments of the 
chance variables dj,, may be obtained by transforming (3.29) and (3.30). 
Furthermore 


0 < day < <1. 
Let w{,) =b/,)/b{,41». We can then state the following corollary: 


COROLLARY 2. 
= 11 + ty — +1, 


1 m—1 
Day, = I] [I] 
y=l 
and 


The proof is omitted. The moments and possible ranges of the chance vari- 
ables w},) are too well known to require statement here. 

We are now able to derive several additional distributions of the general- 
ized analysis of variance. 

To do so, we need only note that 


1 Pp 1 p 1 Pp 1 


t 
= ben, 


and that the probability density and moments of the statistics b;,) may 
therefore be obtained from Theorem 13, Corollary 1. 

The probability density and moments of the statistics dj), where 

b; 
i(y) 
i(m) 

may be similarly derived. 

It may be remarked that the independence of the chance variables $w^i_{(\gamma)}$
permits the distributions depending on them to be expressed in terms of previously
obtained distributions.

Distributions derived from Theorem 13 may be used in testing various 
hypotheses in generalized analysis of variance, comparative analysis,* and 
relations between sets of variables. The method to be used should be suffi- 
ciently clear to make any further transformations here unnecessary. 
A CONTINUOUS FUNCTION WITH NO UNILATERAL
DERIVATIVES* 


BY 
ANTHONY P. MORSE 


1. INTRODUCTION 


A. S. Besicovitch† has given a construction of an even continuous function
$B$ for which he asserts the property $D_+B(x) < D^+B(x)$, ($-1 < x < 1$), and, in
consequence of the evenness of $B$, the companion property $D_-B(x) < D^-B(x)$,
($-1 < x < 1$); that is, $B$ has nowhere a unilateral derivative finite or infinite.
Later E. D. Pepper‡ examined this same function. Some readers have found
difficulty in following the reasoning employed by both these authors, and it
may be that in some minds doubt as to the existence of such a function has
been raised by the theorem of S. Saks§ to the effect that the “functions of
Besicovitch” constitute a set of only first category in the space $C$ of continuous
functions.||

In the present paper we, like Besicovitch, associate with a function having 
dense intervals of constancy another such function. The method of associa- 
tion used here, however, is purely arithmetic and differs essentially from that 
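of Besicovitch. The function F which it is our purpose to define and investigate
is even and continuous on the open interval (-1, 1) and has the property

\[
\liminf_{\xi\to x+}\left|\frac{F(\xi)-F(x)}{\xi-x}\right|
 < \limsup_{\xi\to x+}\left|\frac{F(\xi)-F(x)}{\xi-x}\right| = \infty,
 \qquad -1<x<1,
\]

which, incidentally, the function B does not possess.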

We remark that it is our intention to order the subsequent material so as 
to facilitate the reading of the proof of Theorem 5.2. Should the reader wish 
to acquaint himself at once with the formal definition of F, he may do so by 
examining §2 and Definitions 3.1, 3.3, 4.2, and 5.1. 


* Presented to the Society, December 30, 1937; received by the editors December 18, 1937. 
The author wishes to express his gratitude to E. M. Beesley of Brown University for assistance in 
preparation of the manuscript and for suggestions concerning presentation of the material. 

† Besicovitch, Discussion der stetigen Funktionen im Zusammenhang mit der Frage über ihre
Differentiierbarkeit, Bulletin de l'Académie des Sciences de Russie, vol. 19 (1925), pp. 527-540.

‡ Pepper, On continuous functions without a derivative, Fundamenta Mathematicae, vol. 12
(1928), pp. 244-253. 

§ Saks, On the functions of Besicovitch in the space of continuous functions, Fundamenta Mathe- 
maticae, vol. 19 (1932), pp. 211-219. 

|| It may be recalled that Banach, Über die Baire'sche Kategorie gewisser Funktionenmengen,
Studia Mathematica, vol. 3 (1931), p. 174, has shown that in C the set of functions having at no 
point a finite unilateral derivative is of second category. 


2. NOTATION AND CONVENTIONS 

We employ the symbol [x_1, x_2] to denote a closed interval, the symbol
(x_1, x_2) to denote an open interval. The latter notation will be understood to
imply that x_1 < x_2. By the word set we understand a set of real numbers, by
the word function a function on a set to a set. The letters n, N, ν are permitted
to assume integer values only. We denote the outer Lebesgue measure of a
set A by |A|.

We shall find it convenient to refer to an open interval (α, β) as an interval
of a set A if (α, β) ⊂ A with α and β elements of the closure of the complement
of A. In case A is open this means α and β are not elements of A.

In connection with a function f we employ the notations: 


K(f) = the interior of the set E_x[lim_{h→0} [f(x + h) - f(x)]/h = 0],


H(f) = the complement of K(f) with respect to the domain of f,
Z(f) = E_x[f(x) = 0],    P(f) = E_x[f(x) > 0].


We also agree that if A is a set and r ≥ 0, then the set A^(r) shall be defined
as follows: x ∈ A^(r) if x = ±1 or if there is an interval (a, b) such that
x ∈ (a, b) ⊂ A with b - a > r.
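
The definition of A^(r) can be illustrated for an open set given by finitely many disjoint components. The sketch below assumes the reading "x ∈ A^(r) if x = ±1 or x lies in an interval (a, b) ⊂ A with b - a > r"; for an open set this holds exactly when the component containing x is longer than r. The names `Interval`, `components_longer_than`, and `in_A_r` are illustrative only and do not occur in the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # an open interval (a, b) with a < b


def components_longer_than(components: List[Interval], r: float) -> List[Interval]:
    """Return the components (a, b) of the open set whose length b - a exceeds r."""
    return [(a, b) for (a, b) in components if b - a > r]


def in_A_r(x: float, components: List[Interval], r: float) -> bool:
    """Membership test for A^(r), where A is the union of the given components.

    Assumption: `components` lists the disjoint maximal open intervals of A.
    The points +1 and -1 always belong to A^(r); any other x belongs to it
    only if the component of A containing x has length greater than r.
    """
    if x in (1.0, -1.0):
        return True
    return any(a < x < b for (a, b) in components_longer_than(components, r))


if __name__ == "__main__":
    A = [(-0.9, -0.5), (0.0, 0.125), (0.25, 0.75)]
    for r in (0.0625, 0.25, 0.5):
        print(r, [x for x in (-1.0, 0.0625, 0.5, 1.0) if in_A_r(x, A, r)])
```

As r decreases, more components qualify, so A^(r) grows; this monotonicity is what the later arguments about K(f)^(r) rely on.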

3. THE SEQUENCE {λ_n} AND THE FUNCTION θ


3.1. DEFINITION. {λ_n} is the sequence, defined on all integers n, for which

\[ \lambda_n = \frac{1}{2} + \frac{n}{2(|n|+3)}. \]

In addition to the fact that λ_n → 1 or 0 according as n → +∞ or -∞, the
properties of {λ_n} for which we shall have use are embodied in the following
lemma:

3.2. LEMMA. For each integer n, λ_n + λ_{-n} = 1 and 0 < λ_{n+1} - λ_n < (λ_n)^2 < λ_n.

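Proof. The first relation being evident, we prove the second. For each integer
n, |n|(n+1) - |n+1|n = 0, and we have

\[
\begin{aligned}
0 < \lambda_{n+1}-\lambda_n
 &= \frac12\left(\frac{n+1}{|n+1|+3}-\frac{n}{|n|+3}\right)
  = \frac{3}{2(|n+1|+3)(|n|+3)} \\
 &= \left(\frac{3}{2(|n|+3)}\right)^{2}\frac{2(|n|+3)}{3(|n+1|+3)}
  \le (\lambda_n)^2\,\frac{2}{3}\cdot\frac{|n|+3}{|n+1|+3} \\
 &\le (\lambda_n)^2\,\frac{2}{3}\left(1+\frac{1}{|n+1|+3}\right)
  \le (\lambda_n)^2(2/3)(1+1/3) = (\lambda_n)^2(8/9) < (\lambda_n)^2 < \lambda_n .
\end{aligned}
\]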
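
The relations of Lemma 3.2 can be checked in exact rational arithmetic. The following is a minimal sketch, assuming (as the proof indicates) that the lemma asserts λ_n + λ_{-n} = 1 and 0 < λ_{n+1} - λ_n < (λ_n)^2 < λ_n, and that the increment has the closed form 3/[2(|n+1|+3)(|n|+3)] obtained from the identity |n|(n+1) - |n+1|n = 0. The names `lam` and `check_lemma_3_2` are illustrative and do not occur in the paper.

```python
from fractions import Fraction


def lam(n: int) -> Fraction:
    """lambda_n = 1/2 + n / (2(|n| + 3)), as in Definition 3.1."""
    return Fraction(1, 2) + Fraction(n, 2 * (abs(n) + 3))


def check_lemma_3_2(n_max: int = 200) -> bool:
    """Verify the relations of Lemma 3.2 exactly for -n_max <= n <= n_max."""
    for n in range(-n_max, n_max + 1):
        step = lam(n + 1) - lam(n)
        closed_form = Fraction(3, 2 * (abs(n + 1) + 3) * (abs(n) + 3))
        assert lam(n) + lam(-n) == 1      # first relation
        assert step == closed_form        # increment in closed form
        # second relation, with the intermediate bound (8/9) * lambda_n^2
        assert 0 < step <= Fraction(8, 9) * lam(n) ** 2 < lam(n) ** 2 < lam(n)
    return True


if __name__ == "__main__":
    print(check_lemma_3_2())
```

The bound by (8/9)(λ_n)^2 is attained for n = -1, which is why the check uses a non-strict inequality at that step while the inequality against (λ_n)^2 remains strict.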


Since λ_n converges to 1 or 0 according as n → +∞ or -∞, and since
0 < λ_n < λ_{n+1} < 1 for each integer n (see 3.2), it follows that there is a unique
function θ determined by the following definition:


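3.3. DEFINITION. With γ_n = λ_n for n odd and γ_n = (λ_n)^{1/2} for n even and
𝔖 a particular non-dense closed set which enjoys the property*

\[
|\mathfrak{S}\cdot[\lambda_n-h,\lambda_n]| > 0, \qquad
|\mathfrak{S}\cdot[\lambda_n,\lambda_n+h]| > 0, \qquad h>0,\quad -\infty<n<\infty,
\]

θ is the function on (0, 2) defined by the following relations:

\[
\theta(x) = \gamma_n + (\gamma_{n+1}-\gamma_n)\,
 \frac{|\mathfrak{S}\cdot[\lambda_n,x]|}{|\mathfrak{S}\cdot[\lambda_n,\lambda_{n+1}]|},
 \qquad \lambda_n \le x \le \lambda_{n+1},
\]
\[
\theta(1) = 1, \qquad \theta(x) = \theta(2-x), \qquad 0<x<2.
\]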


3.4. THEOREM. The function θ has the following properties:
(a) θ is continuous on (0, 2) and

\[
\lim_{\xi\to 0+}\theta(\xi) = 0 < x/2 \le \theta(x) \le (x+3)/4 \le \theta(1) = 1,
 \qquad 0<x<1;
\]

(b) K(θ)† is dense in (0, 2), while λ_n ∈ H(θ) for each integer n;
(c) lim sup_{ξ→x} |[θ(ξ) - θ(x)]/(ξ - x)| < ∞, 0 < x < 2;
(d) if (α, β) is an interval of K(θ), then θ[(α+β)/2] ≥ (β - α)^{1/2}.
Proof. In order to establish these properties we note first that for each
integer n, θ is clearly continuous on [λ_n, λ_{n+1}] and that

(1)    λ_n ≤ θ(x) ≤ (λ_{n+1})^{1/2},    λ_n ≤ x ≤ λ_{n+1}.

Thus from (1), the symmetry of θ, and the equalities

\[
\lim_{n\to+\infty}\lambda_n = \lim_{n\to+\infty}(\lambda_n)^{1/2} = 1, \qquad
\lim_{n\to-\infty}\lambda_n = \lim_{n\to-\infty}(\lambda_n)^{1/2} = 0,
\]

it follows that θ is continuous on (0, 2) with θ(ξ) → 0 as ξ → 0+. Observing that
the relations

λ_{n+1} - λ_n < λ_n,    λ_n + λ_{-n} = 1,    n an integer,

* This is an almost immediate consequence of the existence of non-dense closed sets of positive 
measure. 
† The meanings of K(θ), H(θ), and an interval of a set are explained in §2.
found in Lemma 3.2 imply the relations

(2)    λ_{n+1} < 2λ_n,    1 - λ_n < 2(1 - λ_{n+1}),    n an integer,

and recalling that the geometric mean never exceeds the arithmetic, we conclude
from (1) that λ_n ≤ x ≤ λ_{n+1} implies

\[
0 < \frac{x}{2} \le \frac{\lambda_{n+1}}{2} < \lambda_n \le \theta(x) \le (\lambda_{n+1})^{1/2}
 \le 1-\frac{1-\lambda_{n+1}}{2} < 1-\frac{1-\lambda_n}{4} \le 1-\frac{1-x}{4} = \frac{x+3}{4},
\]

whence (a) is established.
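
The two outer bounds in 3.4(a) depend only on the sequence {λ_n}, so they can be checked exactly once inequality (1) is granted. The sketch below is an illustration under that assumption: it verifies, for λ_n ≤ x ≤ λ_{n+1}, that x/2 < λ_n and that (λ_{n+1})^{1/2} ≤ (x+3)/4, the latter in the squared form λ_{n+1} ≤ ((x+3)/4)^2 so that only rational arithmetic is needed. The names `lam` and `check_bounds_3_4a` are illustrative and do not occur in the paper.

```python
from fractions import Fraction


def lam(n: int) -> Fraction:
    """lambda_n = 1/2 + n / (2(|n| + 3)), as in Definition 3.1."""
    return Fraction(1, 2) + Fraction(n, 2 * (abs(n) + 3))


def check_bounds_3_4a(n_max: int = 200, samples: int = 4) -> bool:
    """Check x/2 < lambda_n and sqrt(lambda_{n+1}) <= (x + 3)/4 on [lambda_n, lambda_{n+1}]."""
    for n in range(-n_max, n_max + 1):
        low, high = lam(n), lam(n + 1)
        for k in range(samples + 1):
            # sample points of the closed interval, endpoints included
            x = low + (high - low) * Fraction(k, samples)
            assert x / 2 < low                  # lower bound: x/2 < lambda_n <= theta(x)
            assert high <= ((x + 3) / 4) ** 2   # upper bound in squared form
    return True


if __name__ == "__main__":
    print(check_bounds_3_4a())
```

Together with (1) this gives x/2 ≤ θ(x) ≤ (x+3)/4 on each interval [λ_n, λ_{n+1}], hence on all of (0, 1).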

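Using the notation of 3.3 for the remainder of the proof, and deferring
consideration of (b) until last, we begin checking (c) by noting that (1) and
(2) imply

\[
|\theta(\xi)-\theta(1)| = 1-\theta(\xi) \le 1-\lambda_n < 2(1-\lambda_{n+1}) \le 2\,|\xi-1|,
 \qquad \lambda_n \le \xi \le \lambda_{n+1}.
\]

From this and the symmetry of θ it follows that for x = 1

\[ \limsup_{\xi\to x}\,\bigl|[\theta(\xi)-\theta(x)]/(\xi-x)\bigr| < \infty; \]

for other x ∈ (0, 2), this inequality results from the relation (see 3.3 and note
that |γ_{n+1} - γ_n| < 1 for each integer n)

\[
|\theta(x_2)-\theta(x_1)| \le |x_2-x_1| \,/\, |\mathfrak{S}\cdot[\lambda_n,\lambda_{n+1}]|,
 \qquad \lambda_n \le x_1,\ x_2 \le \lambda_{n+1}.
\]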


In order to check (d), we prove first that λ_n ∈ H(θ) for each integer n. If n
is odd, then γ_n = λ_n < (λ_{n+1})^{1/2} = γ_{n+1} and γ_{n+1} - γ_n > 0, so that the combination
of this last relation with the property of 𝔖 stipulated in 3.3 yields the relations
λ_n ∈ H(θ), λ_{n+1} ∈ H(θ); whence λ_n ∈ H(θ) for each integer n. This of course
implies 1 ∈ H(θ), so that upon letting (α, β) be any interval of K(θ) we have
either (α, β) ⊂ (0, 1) or (α, β) ⊂ (1, 2). Thus in order to verify

θ([α + β]/2) ≥ (β - α)^{1/2}

there is, in view of the symmetry of θ, no loss in generality in assuming
(α, β) ⊂ (0, 1). Since λ_n ∈ H(θ) for each integer n, it is clear that an integer N
exists for which λ_N ≤ α < β ≤ λ_{N+1}, and hence from (1) and Lemma 3.2 follows
the relation

θ([α + β]/2) ≥ λ_N > (λ_{N+1} - λ_N)^{1/2} ≥ (β - α)^{1/2}.
The property (d) is now established as well as the second part of (b); the
first part of (b) is a consequence of the fact that the complement of 𝔖 is an
everywhere dense open set.


4. FUNCTIONS OF CLASS 𝔄; THE FUNDAMENTAL OPERATION


4.1. DEFINITION. A function f on (-1, 1) is said to belong to the class 𝔄
if it possesses the following properties:

(a) f is continuous with 0 ≤ f(-x) = f(x) for x ∈ (-1, 1);

(b) K(f) and P(f) are dense in (-1, 1);

(c) Corresponding to each interval (α, β) of the open set P(f) there exists a
number h ≥ ([β - α]/2)^{1/2} such that x ∈ (α, β) implies


f(x) = h·θ[(2x - 2α)/(β - α)].


In the next definition we exhibit the operation which is fundamental in
the present development and which, among other things, associates with each
function in 𝔄 another such function.


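4.2. DEFINITION. With each function f on (-1, 1) we associate the function
f̄, likewise on (-1, 1), for which x ∈ H(f) implies f̄(x) = 0, and x ∈ K(f) implies
f̄(x) = h·θ[(2x - 2α)/(β - α)], where (α, β) is the interval of K(f) of which x is
an element and h is the lesser of the numbers*

\[
\bigl(\inf\{K(f)^{(\beta-\alpha)}\cdot[\mu,1]\} - \sup\{K(f)^{(\beta-\alpha)}\cdot[-1,\mu]\}\bigr)^{1/2},
 \qquad \mu = (\alpha+\beta)/2,
\]

and

\[ \bigl|f\bigl((\alpha+\beta)/2\bigr)\bigr| \,/\, 2^{1/2}. \]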


We devote 4.3-4.9 to the study of this operation. 
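
The choice of the scale factor h in Definition 4.2 can be sketched for a set K(f) given by finitely many disjoint open components. This is an illustration under stated assumptions only: the first competing number is read, following the proof of Lemma 4.4, as (β_0 - α_0)^{1/2} with α_0 = sup{K(f)^(β-α)·[-1, μ]} and β_0 = inf{K(f)^(β-α)·[μ, 1]}; and a finite family of components cannot be dense, so the example shows the arithmetic rather than a genuine member of the class. The name `scale_h` and its parameters are hypothetical.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]  # an open interval (a, b) with a < b


def scale_h(alpha: float, beta: float, f_mid: float,
            components: List[Interval]) -> float:
    """Lesser of (beta0 - alpha0)^(1/2) and |f(mu)| / 2^(1/2) for the interval (alpha, beta).

    Assumptions: `components` are the disjoint components of K(f), one of which
    is (alpha, beta); `f_mid` is the value f((alpha + beta) / 2).
    K(f)^(r) consists of -1, +1 and the components longer than r = beta - alpha,
    so alpha0 is the nearest right endpoint of a longer component to the left
    (or -1), and beta0 the nearest left endpoint of one to the right (or +1).
    """
    r = beta - alpha
    longer = [(a, b) for (a, b) in components if b - a > r]
    alpha0 = max([-1.0] + [b for (a, b) in longer if b <= alpha])
    beta0 = min([1.0] + [a for (a, b) in longer if a >= beta])
    return min(math.sqrt(beta0 - alpha0), abs(f_mid) / math.sqrt(2))


if __name__ == "__main__":
    K = [(-1.0, -0.5), (-0.25, 0.0), (0.25, 0.75)]
    print(scale_h(-0.25, 0.0, 1.0, K))
    print(scale_h(-0.25, 0.0, 2.0, K))
```

Since α_0 ≤ α and β_0 ≥ β, the first number is never smaller than (β - α)^{1/2}, which is the inequality used in the proof of Lemma 4.4.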
4.3. LEMMA. If f ∈ 𝔄 and (α, β) is an interval of K(f) with (α+β)/2 = μ,
then f(μ) ≥ (β - α)^{1/2}.


Proof. The satisfaction of 4.1(b) by f implies K(f) ⊂ P(f) which in turn
implies the existence of an interval (α', β') of P(f) for which α' ≤ α < β ≤ β'.
Now let

\[
\mu' = \frac{\alpha'+\beta'}{2}, \qquad
\alpha'' = \frac{2\alpha-2\alpha'}{\beta'-\alpha'}, \qquad
\beta'' = \frac{2\beta-2\alpha'}{\beta'-\alpha'},
\]

and observe that the relation (see 4.1(c))

f(x) = f(μ')θ[(2x - 2α')/(β' - α')],    α' < x < β',

implies (α'', β'') is an interval of K(θ). Making use of 3.4(d), 4.1(c), and the
above relations we conclude

\[
\begin{aligned}
f(\mu) &= f(\mu')\,\theta[(2\mu-2\alpha')/(\beta'-\alpha')] = f(\mu')\,\theta[(\alpha''+\beta'')/2]
 \ge f(\mu')(\beta''-\alpha'')^{1/2} \\
 &\ge \{[(\beta'-\alpha')/2]\cdot(\beta''-\alpha'')\}^{1/2}
 = \{[(\beta'-\alpha')/2]\cdot 2(\beta-\alpha)/(\beta'-\alpha')\}^{1/2} = (\beta-\alpha)^{1/2}.
\end{aligned}
\]

* By K(f)^(β-α) we understand [K(f)]^(β-α). The latter notation is explained in §2.
4.4. LEMMA. If f ∈ 𝔄 and (α, β) is an interval of K(f) with (α+β)/2 = μ, then
f̄(μ) ≥ [(β - α)/2]^{1/2} and x ∈ (α, β) implies

0 < f̄(-x) = f̄(x) = f̄(μ)θ[(2x - 2α)/(β - α)] ≤ f̄(μ) ≤ f(x)/2^{1/2}.

Proof. Letting

α_0 = sup {K(f)^(β-α)·[-1, μ]},    β_0 = inf {K(f)^(β-α)·[μ, 1]}

so that f̄(μ) is the lesser of the numbers f(μ)/2^{1/2} and (β_0 - α_0)^{1/2}, we note
that the evenness of f implies (-β, -α) is an interval of K(f) and implies the
equivalence of the relations x ∈ K(f)^(β-α), -x ∈ K(f)^(β-α) which in turn imply

-β_0 = sup {K(f)^(β-α)·[-1, -μ]},    -α_0 = inf {K(f)^(β-α)·[-μ, 1]}.

Consequently, in view of the relation f(μ) = f(-μ), it follows that f̄(μ) = f̄(-μ),
which in conjunction with 4.2 and the symmetry of θ assures us that
f̄(x) = f̄(-x) for x ∈ (α, β). Furthermore the relations

(β_0 - α_0)^{1/2} ≥ (β - α)^{1/2} > [(β - α)/2]^{1/2},    f(μ)/2^{1/2} ≥ [(β - α)/2]^{1/2},

the latter of which is a consequence of 4.3, entail

f̄(μ) ≥ [(β - α)/2]^{1/2} > 0.

From this last relation, 4.2, and 3.4(a) it follows that x ∈ (α, β) implies

0 < f̄(x) = f̄(μ)·θ[(2x - 2α)/(β - α)] ≤ f̄(μ) ≤ f(μ)/2^{1/2} = f(x)/2^{1/2},

and the proof of the lemma is complete.

4.5. LEMMA. If f ∈ 𝔄, then

0 ≤ f̄(-x) = f̄(x) ≤ f(x)/2^{1/2},    -1 < x < 1.


Proof. Since f is even, x ∈ H(f) implies -x ∈ H(f). Hence from 4.2 we conclude
that for x ∈ H(f)


0 = f̄(-x) = f̄(x) ≤ f(x)/2^{1/2},


and the preceding lemma completes the proof.
4.6. LEMMA. If f ∈ 𝔄, then K(f) = P(f̄), H(f) = Z(f̄).


Proof. Lemma 4.4 implies K(f) ⊂ P(f̄). Definition 4.2 implies H(f) ⊂ Z(f̄).
Lemma 4.5 implies that P(f̄) and Z(f̄) are complementary in (-1, 1). The
proof is complete.


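4.7. LEMMA. If f ∈ 𝔄 and (α, β) is an interval of K(f), then α and (α+β)/2
are elements of H(f̄) and we have

\[
\bar f(\alpha) = \lim_{\xi\to\alpha+}\bar f(\xi) = 0
 \le \liminf_{\substack{\xi\to\alpha+\\ \xi\in H(\bar f)}} \frac{\bar f(\xi)-\bar f(\alpha)}{\xi-\alpha}
 < \limsup_{\substack{\xi\to\alpha+\\ \xi\in H(\bar f)}} \frac{\bar f(\xi)-\bar f(\alpha)}{\xi-\alpha} = \infty.
\]

Proof. Referring to 4.4 we note

f̄(ξ) = h·θ[(2ξ - 2α)/(β - α)],    α < ξ < β,

where h = f̄[(α+β)/2] > 0; so that upon noting α ∈ H(f) and referring to
3.4(a), 4.2 we conclude that f̄(ξ) → f̄(α) = 0 as ξ → α+. Also upon defining
ξ_n = α + λ_n(β - α)/2 and recalling that λ_n ∈ H(θ) for each integer n we conclude
that ξ_n ∈ H(f̄) for each integer n and (see 3.3)

\[
\frac{\bar f(\xi_n)-\bar f(\alpha)}{\xi_n-\alpha}
 = \frac{2h}{\beta-\alpha}\cdot\frac{\gamma_n}{\lambda_n}
 = \begin{cases}
 \dfrac{2h}{\beta-\alpha}, & n \text{ odd},\ n\to-\infty,\\[2ex]
 \dfrac{2h}{(\beta-\alpha)(\lambda_n)^{1/2}} \to \infty, & n \text{ even},\ n\to-\infty.
 \end{cases}
\]

Since ξ_n → α or (α+β)/2 according as n → -∞ or +∞, it is clear that α and
(α+β)/2 are elements of H(f̄) and that the proof of the lemma is complete.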


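4.8. THEOREM. If f ∈ 𝔄, then x ∈ H(f)·P(f) implies

\[
\lim_{\xi\to x+}\bar f(\xi) = \bar f(x) = 0
 \le \liminf_{\substack{\xi\to x+\\ \xi\in H(\bar f)}} \frac{\bar f(\xi)-\bar f(x)}{\xi-x}
 < \limsup_{\substack{\xi\to x+\\ \xi\in H(\bar f)}} \frac{\bar f(\xi)-\bar f(x)}{\xi-x} = \infty.
\]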


Proof. In view of the preceding lemma we confine ourselves to the case in
which x not only is an element of H(f)·P(f) but also a cluster point of
[x, 1]·H(f).

Let M(r) = sup_{-1≤t≤1} {inf K(f)^(r)·[t, 1] - sup K(f)^(r)·[-1, t]}, and note
that the openness of K(f) implies (-1, 1)·K(f)^(r) ↑ K(f) as r → 0+ so that the
openness and denseness of K(f) in (-1, 1) imply* M(r) → 0 as r → 0+. Thus if
{ξ_n} is a sequence of points of K(f) tending to x from the right and if the
interval of K(f) of which ξ_n is an element is denoted by (α_n, β_n), it becomes
clear in the light of 4.5, 4.4, and 4.2 that

0 = f̄(x) < f̄(ξ_n) ≤ f̄[(α_n + β_n)/2] ≤ [M(β_n - α_n)]^{1/2} → 0 as n → ∞.

* Let ε > 0; let -1 = a_0 < a_1 < ... < a_N = 1 with a_ν - a_{ν-1} < ε/2 for ν = 1, 2, ..., N. Since there
exists a positive number δ such that K(f)^(δ)·[a_{ν-1}, a_ν] is non-void for ν = 1, 2, ..., N, we conclude
that 0 < r < δ implies M(r) < ε.

Since f̄(x) = f̄(ξ) = 0 for ξ ∈ H(f), it is now evident that f̄(ξ) - f̄(x) → 0 as
ξ → x+. Now a sequence {(a_n, b_n)} of intervals of K(f) with m_n = (a_n + b_n)/2
exists such that m_n → x+ and such that for each positive integer n the set
[x, m_n]·K(f)^(b_n - a_n) is void and* the number [2M(b_n - a_n)]^{1/2} is less than f(m_n).
Recalling 4.2 we have

f̄(m_n) ≤ [M(b_n - a_n)]^{1/2} < f(m_n)/2^{1/2},    n = 1, 2, 3, ...,

which with the fact that [x, m_n]·K(f)^(b_n - a_n) is void for n = 1, 2, 3, ... implies

f̄(m_n) = {inf [m_n, 1]·K(f)^(b_n - a_n) - sup [-1, m_n]·K(f)^(b_n - a_n)}^{1/2}
        > (m_n - x)^{1/2},    n = 1, 2, 3, ....

The Theorem is now a consequence of 4.5, the first conclusion of 4.7, and the
following two relations which are valid for n = 1, 2, 3, ...:

f̄(m_n)/(m_n - x) ≥ 1/(m_n - x)^{1/2},    f̄(a_n) - f̄(x) = 0.
4.9. THEOREM. If f ∈ 𝔄, then f̄ ∈ 𝔄.


Proof. That f̄ is continuous on the right at each point of H(f)·P(f) is a
consequence of 4.8; that the same is true on H(f)·Z(f) and on K(f) is a consequence
of 4.5 in the first case, and of 4.4, 3.4(a) in the second; that f̄ satisfies
4.1(a) is now a consequence of 4.5.

Since K(θ) is dense in (0, 2) it follows from 4.4 that K(f̄) is dense in K(f)
which is dense in (-1, 1). Lemma 4.6 implies K(f) = P(f̄) which of course assures
us that P(f̄) is dense in (-1, 1) and that f̄ exhibits property 4.1(b).

Lemma 4.4 combines with 4.6 to yield the satisfaction by f̄ of 4.1(c).

The proof is now complete. 


5. THE FUNCTION F 
5.1. DEFINITION. {F_n} is the sequence of functions on (-1, 1) determined
by

F_0(x) = θ(x + 1) for -1 < x < 1,    F_{n+1} = F̄_n for n = 0, 1, 2, ....

F is the function defined by the relation

\[ F = \sum_{n=0}^{\infty} (-1)^n F_n . \]


* Here we use the continuity of f, the inequality f(x) > 0, and the fact that M(r) → 0 as r → 0+.
We are now prepared to establish the following theorem:


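5.2. THEOREM. F is even and continuous on (-1, 1) with

\[
\liminf_{\xi\to x+}\left|\frac{F(\xi)-F(x)}{\xi-x}\right|
 < \limsup_{\xi\to x+}\left|\frac{F(\xi)-F(x)}{\xi-x}\right| = \infty,
 \qquad -1<x<1,
\]

and, in particular,*

\[
\liminf_{\xi\to x+}\left|\frac{F(\xi)-F(x)}{\xi-x}\right| = 0,
 \qquad x \in \prod_{n=0}^{\infty} P(F_n).
\]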
Proof. Denoting Σ_{ν=0}^{n} (-1)^ν F_ν by S_n, F - S_n by R_n, P(F_n) by 𝔓_n, Z(F_n)
by ℨ_n, Π_{n=0}^{∞} 𝔓_n by 𝔓, Σ_{n=0}^{∞} ℨ_n by ℨ, and checking that F_0 ∈ 𝔄 we deduce from
4.9 that n = 0, 1, 2, ... implies F_n ∈ 𝔄 and hence that F_n is even and continuous.
That F is even is now obvious; that it is continuous is a consequence
of the easily verified relations (see 4.5)


(1)    |R_n(x)| ≤ F_{n+1}(x) ≤ F_n(x)/2^{1/2} ≤ 2^{-(n+1)/2}.
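
The first inequality of (1) is the usual remainder estimate for an alternating series whose terms are non-negative and non-increasing. The sketch below illustrates it numerically on a model sequence; it is not the function F itself. It assumes terms a_k ≥ 0 with a_{k+1} ≤ a_k/2^{1/2} (here a_k = 0.6^k, a hypothetical choice) and checks that the remainder after n terms is bounded by the first omitted term and has its sign. The names `remainder` and `check_remainder_bounds` are illustrative.

```python
import math
from typing import List


def remainder(terms: List[float], n: int) -> float:
    """R_n = sum over k > n of (-1)^k * terms[k], for a finite list of terms."""
    return sum((-1) ** k * terms[k] for k in range(n + 1, len(terms)))


def check_remainder_bounds(terms: List[float]) -> bool:
    """Check |R_n| <= a_{n+1} <= a_n / sqrt(2) and the sign of R_n for every n."""
    for n in range(len(terms) - 1):
        r = remainder(terms, n)
        first_omitted = (-1) ** (n + 1) * terms[n + 1]
        assert abs(r) <= terms[n + 1] <= terms[n] / math.sqrt(2)
        assert r * first_omitted >= 0  # same sign as the first omitted term
    return True


if __name__ == "__main__":
    model_terms = [0.6 ** k for k in range(40)]  # ratio 0.6 < 1/sqrt(2)
    print(check_remainder_bounds(model_terms))
```

The sign statement is the one invoked later in step (iii) of the verification of (5).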


A consequence of (1) of which considerable implicit use will be made is
the fact that F_{n+1}(x) = 0, or x ∈ ℨ_{n+1}, implies R_n(x) = 0. It also follows from
(1) that 𝔓_n and ℨ_n are complementary sets in (-1, 1) for n = 0, 1, 2, ..., that
𝔓 and ℨ are likewise complementary in (-1, 1), and that ℨ_0 ⊂ ℨ_1 ⊂ ···
with 

We divide the remainder of the proof into three parts. 

Part I. Jf x « 3 then 
FO FO) | FO) 


lim inf | < lim su 
fort gE— fort 


Letting x» be any element of 3 we introduce the notation 
= — — 20) 


to facilitate verification of lim inf;..,, | AF(é)| < lim sup;..,, | AF(¢)| 
To this end let NV (21) be the integer for which x e 3v- By-1. Then since 
xo e forO<n<N —1, it follows from4.1(c) and3.4(c) that 
lim |AF,(£)| <2 and hence lim sup;.2,4+ | ASy-1(€)| On the 
other hand ARy(é)=0 for £ e 3yv4:. Moreover, in view of the relations 
Fy_,e U, Fy =Fy-1, and (see 4.6) 

H(F,) = Z(Fn+1), n=0,1,2,---, 


* The set Π_{n=0}^{∞} P(F_n) is a residual set of measure zero. The construction can be modified so that
this set has measure arbitrarily close to two.
it may be seen that use of 4.8 yields


0 lim inf AFy() < lim sup AFy() = 
N+1 


Thus from the above follow the two relations 
lim inf | AF(g)| < lim inf | AF(¢)| = lim inf | ASy_s(¢) + AFy(é)| 
< lim sup | ASy_1(¢)| + lim inf | AFy(g)| < © 


N41 


and 
lim sup | AF(é)| = lim sup | ASy_s(£) + AFw() | 
+ 
> lim sup | AFw(¢) | — lim sup | ASy_s(¢)| = ~, 


&e3y41 


which in conjunction with the fact that xo is any point of 3 complete the 
proof of Part I. 
Part II. Jf x then 


lim inf | [F(e) — F(x)]/& — x)| = 0. 


Let xo be any point of §, let (an, 8.) be the interval of $,, of which xo 
is an element, let u,=(an+6,)/2. From 4.6 follows 
(an, Bn) = = K(F,), 
which implies 
(2) Fy(un) — F.(x0) = FABn) — = 0, 
S(un) — = S(Bn) — = 0, Osvsn-1. 


Thus for even and positive (note that 8, ¢ 3,, and see inequality (1)) 
F(Bn) F (xo) Sn-1(Bn) Sn—-1(%0) + F,(Bn) — Fa(%0) + R,(Bn) R,(%0) 
(3) = —F,(x0) — Ra(xo) S — Fa(%o) +| Ra(xo) | 
— + = (— 1 + 2-"/*)F, (x) 0, 
whereas for odd and positive 
F(8n) — F(2%0) = Fn(20) — Rn(%0) Fn(%0) — | Rn(x0) | 2 — 2-1/2) = O. 


These last two relations coupled with the continuity of F imply the existence 
of {£,} for which 
F(ξ_n) - F(x_0) = 0,    β_{n+1} ≤ ξ_n ≤ β_n,    n = 1, 2, 3, ....

Hence in view of the relation (valid for m=1, 2, 3, - - - ) 
(4) 0 < — %0 S Ban — S Bn — S 2[F a(un)]? S asn— 


which is a consequence of 4.1(c), the equality 9(1)=1, and inequality (1), 
the remainder of the proof of Part IT is evident. 
. Part III. If x « then 


lim sup | [F(@) — = 
Let xo be any element of §, and let a,, 8,, and uw, be as defined in Part II. 


By reasoning to be given later we have, in case a, <%o <u, with m even and 
positive, 


| F(un) — | = | Fa(uin) — — | — — 
2x9 2an 
=> F,,(un) — = Falun) ~ 


n— Qn 


( Mn — Xo ) Mn — Xo 
2 2Bn — 2an 23/2(B, — 


> — x9) S /2(y, — xo), 


whereas in case <8, with m even and positive we have 


| F(Bn) — F(x0) | = | Fa(xo) + Ra(xo) | Fa(ao) — | Rn(o) | 
> F,(o)(1 — 2-1/2) 


2an 
= 2-°F = (un): 


n — Qn 


28, — 2x 
> 2-°F (un) (- 


n — 
(= - 5-6/2 (8, — %o) 
=> 2(n-6)/2. (8, Xo). 


Consequently { £, } exists for which OT Bon, (w=1, 2, 3, -- - ), with 
the result that 
as


— F(%o) 


fn — Xo 


and the proof of the theorem will be complete as soon as we verify (5) and (6). 

We observe that (5) consists of ten steps the reasons for which we now
give in order. (i) This is a consequence of relation (2) of Part II and the fact
that 4.7 and 4.6 imply μ_n ∈ H(F_n) = Z(F_{n+1}) = ℨ_{n+1}. (ii) Arithmetic. (iii) In an
alternating series of the type considered here the remainder term has the
same sign as the first term omitted. Thus (-1)^{n+1}F_{n+1}(x_0) and R_n(x_0) are
non-positive since n is even and positive. (iv) This is a consequence of 4.1(c)
and θ(1) = 1. (v) From 3.4(a) it follows that θ(x) ≤ (x+3)/4 for 0 < x < 1.
(vi) Arithmetic and the definition of μ_n. (vii) Inequality (4) of Part II.
(viii) Arithmetic. (ix) Inequality (4) of Part II. (x) Arithmetic.

We treat (6) in a similar manner. (i) Relation (3) of Part II. (ii) An elementary
inequality. (iii) Relation (1). (iv) Arithmetic. (v) This is a consequence
of 4.1(c) and θ(1) = 1. (vi) From 3.3 we have θ(x) = θ(2 - x) for
0 < x < 2. (vii) From 3.4(a) we have x/2 ≤ θ(x) for 0 < x < 1. (viii) Inequality
(4) of Part II. (ix) Arithmetic. (x) Inequality (4) of Part II.
A SYSTEM OF ORDINARY LINEAR DIFFERENTIAL
EQUATIONS WITH TWO-POINT 
BOUNDARY CONDITIONS* 


BY 
WILLIAM T. REID 


1. Introduction. In a recent paper Bliss [3]† has treated a boundary value
problem which is definitely self-adjoint according to a weakened form of an 
earlier definition of definite self-adjointness which he gave in a paper [2] 
published in 1926. He has shown that a system which satisfies this modified 
definition has most of the properties proved in the former paper for a system 
satisfying the original and stronger definition of definite self-adjointness. The 
characteristic values for such a problem are all real and have indices equal to 
their multiplicities, and the expansion theorems established in the earlier 
paper are still valid. Such a system, however, will not in general have an 
infinity of characteristic values. 

Bliss has shown that the canonical form of the so-called accessory bound- 
ary value problem for a normal non-singular problem of Bolza in the calculus 
of variations is definitely self-adjoint according to this new definition, while 
such a problem is in general not definitely self-adjoint according to the origi- 
nal definition. Hu [5] has proved that this problem has an infinity of char- 
acteristic values. Various authors (Morse [7] and [8], Reid [9], [10], and 
[11], Hu [5], Hölder [4], Birkhoff and Hestenes [1], Wiggin [12]) have
treated the accessory problem under the assumptions that the problem is 
normal and satisfies the strengthened Clebsch condition. This latter condition 
implies non-singularity. Under these hypotheses the characteristic values 
possess certain extremizing properties. Moreover, for such systems there have 
been established oscillation and comparison theorems which are generaliza- 
tions of the classic results of the Sturmian theory for a single second order 
differential equation (see, for example, Ince [6], chap. 10). It is to be re- 
marked that for the problem treated by the author ([9] and [10]) it is not 
assumed that the coefficients of the terms involving the parameter are those 
of a definite quadratic form. To compensate for this weakened condition it
is assumed that the quadratic functional of the associated minimum problem
is definite. This latter hypothesis is no additional restriction for an accessory 
problem which is normal, satisfies the strengthened Clebsch condition, and 


* Presented to the Society, April 11, 1936; received by the editors November 3, 1937. 
† Numerals in square brackets refer to the bibliography at the end of the present paper.
for which the coefficients of the terms involving the parameter belong to a
definite quadratic form. In the author’s paper [9] there are also established 
expansion theorems in terms of the characteristic solutions of the problem. 

It is the purpose of the present paper to consider a boundary problem 
which is definitely self-adjoint according to the new definition of Bliss, and 
which satisfies the additional condition that the matrix of coefficients of the 
terms involving the parameter has constant rank on the interval of definition. 
This latter condition is satisfied in the most important examples of definitely 
self-adjoint systems. It is here shown that such a system is equivalent to a 
boundary value problem associated with the second variation of a calculus of 
variations problem, and of the type previously treated by the author [9]. The 
character of the equivalence is new in the sense that the canonical form of the 
second problem is not the given definitely self-adjoint problem, but rather a 
system involving twice the number of dependent functions occurring in the 
original system. There is obtained a general condition which is both necessary 
and sufficient for the given system to have an infinity of characteristic values. 
There are also given two different sets of conditions which are merely suffi- 
cient for this conclusion. In particular, in §5 it is shown that an accessory 
boundary value problem which is normal and non-singular has an infinity of 
characteristic values. As stated above, this result was first proved by Hu. As 
already noted, oscillation, comparison, and expansion theorems are known for 
a system of the form to which the given definitely self-adjoint problem is here 
shown to be equivalent. From these theorems there are deduced in §6 certain 
corresponding results for the given boundary value problem. 

The notation used for the definitely self-adjoint system here treated is the 
same as that used by Bliss in papers [2] and [3]. Frequent references are 
made to the results concerning differential systems in general, and the theo- 
rems for self-adjoint systems, obtained by Bliss in §§1 and 2 of paper [2]. 
It is to be noted that for the particular definitely self-adjoint problem herein 
treated the proof of the reality of characteristic values is independent of the 
proof of paper [3] of Bliss, since this result is here deduced from the corre- 
sponding theorem for the equivalent problem of the type previously treated 
by the author. 

2. Statement of problem. In the following pages the summation convention
of tensor analysis will be used. The subscripts i, j, k will have the range
1, ..., n. The matrices ||A_{ij}(x)||, ||B_{ij}(x)|| will be supposed to have elements
which are real-valued and continuous on a ≤ x ≤ b, while the elements of the
matrices ||M_{ij}||, ||N_{ij}|| are real constants such that the matrix ||M_{ij} N_{ij}|| has
rank n. For brevity we will introduce the notations*


* Here, as elsewhere, primes are used to denote derivatives with respect to the variable x. 
(2.1)    ℒ_i[y] ≡ y_i' - A_{ij}y_j,    ℳ_i[z] ≡ z_i' + z_jA_{ji}.

The boundary value problem to be considered in this paper may then be
written as

(2.2)    ℒ_i[y] = λB_{ij}y_j,    s_i[y] ≡ M_{ij}y_j(a) + N_{ij}y_j(b) = 0.

The system adjoint to (2.2) is (see Bliss [2]),

(2.3)    ℳ_i[z] = -λz_jB_{ji},    t_i[z] ≡ z_j(a)P_{ji} + z_j(b)Q_{ji} = 0,

where p_j = P_{ji}, q_j = Q_{ji}, (i = 1, ..., n), are linearly independent solutions of
the algebraic equations

M_{ij}p_j - N_{ij}q_j = 0.

The boundary value problem (2.2) will be treated under the following 

hypotheses: 


(H_1) The system (2.2) is self-adjoint under the non-singular transformation
z_i = T_{ij}(x)y_j, that is, under this transformation the differential equations
and boundary conditions of the adjoint system (2.3) are equivalent to those
of (2.2) for all values of λ. The elements T_{ij}(x) are supposed to be of class C^1
on the interval ab.


A non-singular matrix ||T_{ij}(x)|| of functions of class C^1 is such a transformation
matrix if and only if

(2.4)    T_{ij}' + A_{ki}T_{kj} + T_{ik}A_{kj} = 0,    T_{ik}B_{kj} + B_{ki}T_{kj} = 0,
         M_{ik}T^{-1}_{kl}(a)M_{jl} = N_{ik}T^{-1}_{kl}(b)N_{jl}.


In particular, if u_i(x) and v_i(x) are sets of functions of class C^1 and are
related by the equations v_i(x) = T_{ij}(x)u_j(x), we have the identity

(2.5)    ℳ_i[v] = T_{ij}(x)ℒ_j[u].

Moreover, t_i[v] = 0 if and only if s_i[u] = 0.


(He) | Si;(x)}| =|| T is symmetric. 
(H;) If (u,) is a set of real constants, then 


= 0. 


(H_4) There exists no non-identically vanishing solution of ℒ_i[y] = 0,
s_i[y] = 0 such that B_{ij}(x)y_j(x) = 0 on ab.

If the system (2.2) is the canonical form of a so-called accessory boundary
value problem associated with the second variation of a problem of the calculus
of variations, the above hypothesis (H_4) is equivalent to the assumption
of normality of the accessory problem. The significance of this hypothesis for
the general self-adjoint system of the form (2.2) has been pointed out by
Bliss [3].

(H_5) ||B_{ij}(x)|| is of constant rank n - m, (0 < m < n), on a ≤ x ≤ b.


If y:(a), yi(b), z:(a), 2:(b) are arbitrary values, we have (Bliss [2]), 
b 
(2.6) val | = + 


where 5;[y] = Mijy(a) + [2] =2;(a) Pj: +2;(b)0j:, and the matrices 
|| || are chosen so that 


are reciprocals. Moreover, if y_i(x), z_i(x) are solutions of the differential equations
of the systems (2.2) and (2.3), respectively, for some value of λ, then
[y_i(x)z_i(x)]' = 0. Combining these results, and making use of the self-adjointness
of the system (2.2) under the transformation z_i = T_{ij}y_j, we obtain that
if y_i(x) and y_i*(x) are solutions of (2.2) corresponding to distinct values λ
and λ*, respectively, then (see Bliss [2])


(2.7)    ∫_a^b y_i(x)S_{ij}(x)y_j*(x)dx = 0.


3. Relation of the boundary value problem to the calculus of variations. 
In this section there will be pointed out the relation of the above boundary 
value problem (2.2) with a problem of the form of an accessory boundary 
problem associated with the calculus of variations. 

As a consequence of hypotheses (Hz), (Hs), and (H;) there exist m sets of 
functions Tia(x), (a=1, - - - , m), such that 


(3.1) = 0, 


Moreover, these functions 7;.(x) may be chosen as continuous on a<x3bJ, 
and orthonormal in the sense that 


(3.2) = das, 
In view of (H;) it follows that the matrix 
Si(x)  mia(x) | 
Tia(X) 


is non-singular on ab. The reciprocal of this matrix is seen to be of the form 
3.3
T ja(X) 


and R;;(x)u,u;>0 for every set (w;) ~(0;) satisfying the relations 7j.(x)u;=0, 
(a=1,---,m). 

Finally, we shall suppose that there are exactly p linearly independent
solutions y_i = u_{iκ}(x), (κ = 1, ..., p), of the system (2.2) for λ = 0. Bliss has
shown that under the hypotheses (H_1)-(H_4) the index of a characteristic value
is equal to its multiplicity as a root of the characteristic equation D(λ) = 0.
Hence not all values of λ can be characteristic values of (2.2) under these
hypotheses, and without loss of generality we might assume λ = 0 is not a
characteristic value. The method of proof as given by Bliss is similar to that
used in his previous paper [2] to establish the same result under stronger
hypotheses. However, for completeness and independence of the results of
the present paper, use will not be made of this result. It is to be pointed out
that Theorem 3.1 below includes the result that under hypotheses (H_1)-(H_5)
all values of λ cannot be characteristic values of (2.2).

An arc z= [z,(x)] will be called admissible if the functions 2;(x) are of 
class D’ on ab, and satisfy the differential equations 


(3.4) = 0, 


For brevity, the class H* will be defined as the totality of admissible arcs z 
satisfying the conditions 


(3.5) t;[z] = 2;(a) Pj + ji 0, 


(3.6) = 0, 


(3.7) f Kel = 1, 


where K;;(x) = —Byx(x)Tq'(x). Since Ki;(x) (x), the ma- 
trix || K,;(x)|| is symmetric and u;K;;(x)u;=0 for all real sets (w;) ¥(0;). 

Suppose the class H* is non-vacuous, and consider the problem of mini- 
mizing the integral 


(3.8) f [2 ]dx 


in this class. In view of (3.4) we have that I[z] is non-negative in the class 
of admissible arcs. 
Because of the integral conditions (3.6) this minimum problem is not of 


| 

a= 
i=1,---,m, 
the form of an accessory problem associated with the problem of Bolza in
the calculus of variations. It may be reduced, however, to such a problem by 
the introduction of new variables z,,, satisfying the differential equations 
and end conditions 2,4,(@) —2n4(b) =0 

If, for brevity, we set 
(3.9) Ri(x)Mj[z] + wia(x)ua(x) = Fi(x), 


the first necessary condition for this calculus of variations problem states that 
there exist multipliers 4.(x) and constants o,, A, d:, such that in addition to 
equations (3.4), (3.5), (3.6), (3.7) we have 


(3.10) — Aig; — + AKi2; = 0, 


(3.11) Pid; — = Qisd; + = 0, 
where ¢;(x) is the function defined in terms of 2;(”), ua(x) by (3.9). Solving 
(3.4) and (3.9) simultaneously, we obtain 
Moreover, since Mi.P..;— NixsQx;=0, the equations (3.11) are equivalent to 


the conditions s;[¢]=0. Consequently, the system (3.4), (3.5), a 6), (3.10), 
(3.11) may be written 


mM; [z] = = By — AK;;2;,; 


(3.12) =0, si[t] =0, 


l,---,p. 


Now, since y;=1i.("), (k=1, , p), is a solution of £;[y] =0, s:[y] =0, 
we have that 2; =2;,(x) = 7j:(z)u;.(x) is a solution of 2; [z] =0, ¢;[z]=0. Using 
this relation, together with (2.6), we obtain in view of (H;) and (H,) the fol- 
lowing result: 

LEMMA 3.1. If z_i, ζ_i satisfies (3.12) with constants σ_κ, Λ, then σ_κ = 0,
(κ = 1, ..., p).

LEMMA 3.2. The system (3.12) is normal.

For if this system is not normal, there exist functions z_i(x) ≡ 0,
ζ_i(x) ≢ 0 satisfying with constants σ_κ, Λ this system. In view of Lemma 3.1,
σ_κ = 0. Then ζ_i = π_{iα}(x)μ_α(x) is a solution of ℒ_i[ζ] = 0, s_i[ζ] = 0. But since
B_{ij}π_{jα} = 0, we then have B_{ij}(x)ζ_j = 0 and by (H_4) we have ζ_i(x) ≡ 0, contrary
to the assumption that ζ_i(x) ≢ 0. Therefore the system (3.12) is normal.


Lemma 3.3. The value A =0 is not a characteristic value of (3.12). 
This result is a consequence of Lemma 3.1, relation (2.6), hypotheses (Hs)
and (H;), together with the fact that every solution of the system £;[y] =0, 
si[y ] =0 is a linear combination of the functions u;,(x), (k=1, ---, p). 


THEOREM 3.1. The characteristic values A of (3.12) are all real and positive, 
and at most denumerably infinite in number. 


The reality of the characteristic values has been proved by the author [9]. 
The positiveness is a consequence of the above hypotheses and the identity 


which is satisfied by the functions z;(x) of a solution 2;, [:, ¢,=0 of (3.12) for 
a corresponding value A. Since A=0 is not a characteristic value, and the 
characteristic values are the zeros of a permanently convergent power series 
(see Bliss [2]), we have at most a denumerable infinity of such values. 

The following properties have been proved by Reid [9]: 


THEOREM 3.2. If the class H¥* is non-vacuous and A, is the greatest lower 
bound of I|z] in this class, then Ay>O and A=A, is the least characteristic 
value of (3.12). 


THEOREM 3.3. Suppose $\Lambda_1<\Lambda_2<\cdots<\Lambda_{t-1}$ are consecutive characteristic
values of (3.12), and corresponding to A=A,, (s=1,--- , t—1), there are r, 
linearly independent solutions 2i9,, xa, =9, (Qs=1, , 7s). Define the class 
H#é as the subclass of arcs z belonging to H* and satisfying the additional condi- 
tions 


b 


If H¥ is non-vacuous and A, is the greatest lower bound of I|z] in this class, 
then A,>At+ and A=A, is a characteristic value of (3.12). 


If 2, ¢:, ¢,=0 is a solution of (3.12) for a value A, set 2;=(1/A"?)T j:(x);. 
Then i, ¢; is a solution of the system 


(3.13) = Lilt] = siln] =0, s:[¢] = 0. 


Now if is a solution of (3.13), the functions n= and 
ni=ai—bi, (:=bs—4; are also solutions of this system. Consequently, if the 
index of A as a characteristic value of (3.12) is equal to 7, there exist 7 linearly 
independent solutions 7;, ¢; of (3.13) such that for each of these solutions we 
have either $\eta_i=\zeta_i$ or $\eta_i=-\zeta_i$. Now if $\eta_i=\zeta_i$ is a characteristic solution of
(3.13) we see that $y_i=\eta_i$ is a characteristic solution of (2.2) for $\lambda=\Lambda^{1/2}$; similarly,
if $\eta_i=-\zeta_i$ is a characteristic solution of (3.13), we have that $y_i=\eta_i$
is a solution of (2.2) for $\lambda=-\Lambda^{1/2}$. Hence the sum of the indices of $\Lambda^{1/2}$ and
$-\Lambda^{1/2}$ as characteristic values of (2.2) is not less than the index of $\Lambda$ as a
characteristic value of (3.12).

Now if $\lambda\neq 0$ is a characteristic value of (2.2) and $y_i$ is a corresponding
characteristic solution, then $z_i=(1/\lambda)T_{ji}y_j$, $\zeta_i=y_i$ is a solution of (3.12) for
$\Lambda=\lambda^2$. The orthogonality of the functions $z_i(x)$ to the sets $u_{ik}(x)$, $(k=1,\dots,p)$,
is a consequence of (2.7). In view of Lemma 3.1 we have, therefore, that all
the characteristic values of (2.2) are real. In view of (2.7) the set of characteristic
solutions of (2.2) corresponding to the values $\Lambda^{1/2}$ and $-\Lambda^{1/2}$, where $\Lambda$
is a given positive value, are linearly independent. Hence the index of $\Lambda$ as
a characteristic value of (3.12) is not less than the sum of the indices of $\Lambda^{1/2}$
and $-\Lambda^{1/2}$ as characteristic values of (2.2). We have, therefore, the following
theorem:


THEOREM 3.4. The characteristic values of (2.2) are all real and at most
denumerably infinite in number. If $\Lambda$ is a characteristic value of (3.12),
either $\Lambda^{1/2}$ or $-\Lambda^{1/2}$ is a characteristic value of (2.2). Conversely, if $\lambda$ is a characteristic
value of (2.2), then $\Lambda=\lambda^2$ is a characteristic value of (3.12) whose index
is equal to the sum of the indices of $\lambda$ and $-\lambda$ as characteristic values of (2.2).


4. Sufficient conditions for the existence of infinitely many characteristic 
values. One may show by simple examples (see Bliss [3]) that the hypotheses 
of §2 do not imply the existence of infinitely many characteristic values of 
(2.2). In this section we shall give certain sets of conditions that insure this 
property. In view of Theorem 3.4 we may assume without loss of generality
that $\lambda=0$ is not a characteristic value of (2.2), and the sets $u_{ik}$, $(k=1,\dots,p)$,
are missing. If this condition is not true for the original problem it may be
attained by a linear change of parameter. We shall make this assumption in 
the future consideration of this problem. The following general theorem follows 
from Theorem 3.4 and the extremizing properties of the characteristic values 
of (3.12) (see, for example, Reid [9] and [11]). 

THEOREM 4.1. A system (2.2) satisfying $(H_1)$–$(H_5)$ has an infinity of characteristic
values if and only if there are infinitely many arcs $z_i=w_{is}(x)$,
$(s=1,2,\dots)$, satisfying (3.4), (3.5), and such that for each $r$ and arbitrary
constants $(d_t)\neq(0_t)$, $(t=1,\dots,r)$, the arc $w_i=w_{it}(x)d_t$ satisfies the condition


(4.1) $\int_a^b w_iK_{ij}w_j\,dx>0$.


The following hypothesis is a weakened form of a condition used by Bliss 
in originally defining definitely self-adjoint systems. 
$(H_6)$ If $g_i(x)$ are arbitrary continuous functions on $ab$, and $y_i(x)$ is a solution
of the system
(4.2) $L_i[y]=B_{ij}(x)g_j(x)$, $s_i[y]=0$,
then $y_iS_{ij}y_j\equiv 0$ on $ab$ if and only if $y_i\equiv 0$.

THEOREM 4.2. If (2.2) satisfies (Hi), (Hz), (Hs), (Hs), and (He), this system
has infinitely many characteristic values. 

Hypotheses (H2), (Hs), and (Hg) are seen to imply (H,). Denote by g;.(x), 
(s=1, 2,---), sets of functions such that the sets B,;(x)g;.(x) are linearly 
independent on ab. Such a choice is clearly possible. Now let y;=yis(x) be a 
particular solution of (4.2) for gi=gi.(x), and set w(x) =7j:(x)y;.(x), 
(s=1, 2,---). By the use of relation (2.5) it may be proved that the 
functions 2;=w;,(x) satisfy (3.4) and (3.5). Furthermore, if (d,)#(0,), 
(t=1,---,r), and wi =wie(x)d, we have 


b 
f w,K,;wjdx -{ ySizvjdx > 0 


in view of (H,). The result of Theorem 4.2 is then a consequence of Theorem 
4.1. 

COROLLARY. If the system (2.2) satisfies hypotheses (Hi), (H2), (Hs), and
$|B_{ij}|\neq 0$ on $ab$, then (2.2) has infinitely many characteristic values.

Hypothesis (Hg) is weaker than the condition originally used by Bliss [2] 
in defining definitely self-adjoint systems since the extra condition s;[y]=0 
has been added. The result of Theorem 4.2 is of a somewhat less general char- 
acter than that originally obtained by Bliss [2], however, since we have as- 
sumed (Hs). 

THEOREM 4.3. Suppose that (2.2) satisfies (Hi)—(Hs), and that the functions 
Bj; are of class C' on ab. Then, if the matrix 


—1 
has at some point x» of ab a rank greater than m, the system (2.2) has infinitely 
many characteristic values. 


If the functions B;; are of class C!, the functions 7;.(x) may also be chosen 
to be of class C!, and we shall suppose that these functions are so selected. 
Now a set 2; satisfies (3.4) if and only if there are functions g;(x) such that 


(4.4) = — gi(x)By(x). 


In view of (H:) and (H;) a set of continuous functions z,;(x) renders 
[z:Kij2;dx =0 if and only if the set z; is of the form 7j;(x)rja(x)f.(x). We
shall now determine under what conditions a set of this latter form is a solu- 
tion of (4.4). In view of (2.5), such a set satisfies (4.4) if and only if 


(4.5) Lilrafa] = Wiafa + fa 


The functions £;[7,] are continuous since 72(x) are of class C1. Now, in view 
of the hypotheses of the theorem, there is an interval a,b, containing x» and in- 
<ipSn,1Sji<je< p>m, such that the 
determinant | 1B iy , q=1,---, p; h=1,---, p—m), is different 
from zero on If g;=0 for 7¥j,, (h=1, -- - , p—m), then on a,b; there are 
continuous functions Dag(x), Ens(x) such that 


(4.6) fa = Dasfs, 
(4.7) Sin = Ensfa- 


Let fa=fay(x), (y=1,---,m) bea fundamental set of solutions of (4.6) on 
a,b,. Suppose that g(x) is a set of functions such that g*=0 for 7#j,, and 
the set g* is linearly independent of the m sets of functions g;,=Enafsy on 
a,b,. It follows that if z;=w; is the solution of 


(4.8) = — =0, 


then w; is not of the form Tj jaf. aib1, and hence >0. Now one 

may choose an infinity of sets g(x), (s=1,---), such that if r is a given 

integer and (d,)~(0,) (¢=1,---,1r), then the set g*=gi(x)d, satisfies the 

conditions described above for the set g}*. If z;:=w;, are the corresponding 

solutions of (4.8), we have that the set w;,, (s=1, - - - ), satisfies the conditions 

of Theorem 4.1, and hence (2.2) has infinitely many characteristic values. 
We have, in particular, the following result: 


COROLLARY. Suppose that (2.2) satisfies $(H_1)$–$(H_5)$, the functions $B_{ij}$ are of
class $C^1$ on $ab$, and $m<n-m$. Then this system has infinitely many characteristic
values.


To establish this corollary, one need only note that the rank of (4.3) is 
not less than the greater of the two values m and n—m. 
5. A special boundary problem. Consider the differential equations 


5.1 


where ||@or||, are symmetric. The coefficients in (5.1) are real- 
valued and continuous on ab, and u,K,,;u, >0 on ab for (u,) #(0,). Associated 
with (5.1) we consider boundary conditions 
(5.2) = ayene(a) — + + = 0,
y= 1, eg 2N, 


where these conditions are linearly independent, and where the matrix 
(y, 5=1,---, is symmetric. We shall also sup- 
pose that there is no non-identically vanishing solution 7,, ¢. of (5.1), (5.2) 
with 7, =0, (c=1,---,N),onab. 

The system (5.1), (5.2) is self-adjoint with respect to the transformation 
matrix 
Ser 


Ser 


= ||_ 
Under the above conditions, 


| Orr sy -| Kar 


and (5.1), (5.2) satisfies (H:)—-(Hs) of §2. We may, therefore, assume without 
loss of generality that \=0 is not a characteristic value, and for simplicity 
we shall make this assumption. 

If we assume in addition that ||8,,|| has constant rank N —q, (0<q<WN), 
on ab, this system is equivalent to a boundary problem associated with the 
second variation of a problem of Bolza in the calculus of variations which 
is normal and non-singular, but which does not necessarily satisfy the Clebsch 
condition. If K,,=6,,, we have the problem which Hu [5] considered, and 
for which he proved the existence of infinitely many characteristic values. 
His method of proof is closely related to that previously used by Bliss [2] in 
treating definitely self-adjoint problems. In the following, we shall not as- 
sume that ||8,,|| has constant rank N—q, but simply that all the B,, are not 
identically zero on ab. 

The system adjoint to (5.1), (5.2) is 


Ue + + = 
ve + — = 0, 
(5.4) b+ + a4,0,(a) + (6) = 0, 

The condition (3.4) then becomes, in terms of (z;) = (u., 2), 
(5.5) ve + — = 0, 


= 


=| 


(5.3) 


and the integral of (3.7) reduces to 


(5.6) 


y=1,--+,2N. 
o=1,---,N, 
Let a,b, be a subinterval of ab on which all the 8B,, are not identically
zero, and by u, (v=1, - - - ,N+1), sets of functions of class on 
with u,,(a,) =0 =u,,(b;),and such that the V +1 sets u,,B,.,(v=1, ---,N+1), 
are linearly independent on this interval. Let 7, =v,,(x) be the solution of (5.5) 
for u,=4,,, and satisfying the initial condition v,,(a,)=0. Clearly the sets 
2, =0,,(x), (v=1,---, N+1), are linearly independent on a,b,. Now define 
- - - where the constants d,, (t=1,---,N+1), 
are not all zero and such that 2,(b.) =0. If the functions wu,, v, thus defined on 
a,b; are extended to the whole of ab by defining them as zero on the remainder 
of ab, then (z;) = (u,, v,) is an admissible arc for the above defined problem, 
and the corresponding integral (5.6) is positive. 

Now let A,, (s=1, 2, - - - ), be a denumerable set of non-overlapping sub- 
intervals of a,b,, and relative to A, define a set of functions (z;) = (wis) = (urs, Urs) 
in the manner described above. This set (w;,) is seen to satisfy the conditions 
of Theorem 4.1; hence the system (5.1) has a denumerable infinity of 
characteristic values. 

6. Oscillation and comparison theorems. The following comparison theo- 
rem is a consequence of Theorem 5.5 of Reid [11]. 


THEOREM 6.1. Suppose that the two boundary value problems 
(6.1) Lily] = sily] = + Nesyi(b) = 0, 
(6.2) Lily] = s* ly] = + = 0, 
are self-adjoint with respect to the same transformation matrix ||T;;||, that each 
of these problems satisfies (Hi)—(Hs), and \=0 is not a characteristic value for 
either of these problems. For arbitrary values L >0, let V1 [W 1] denote the number 


of characteristic values of (6.1) such that <L SL]. The numbers V 
W <* are defined for (6.2) in an analogous manner. If the matrix 


| Mi Nis 
Mi, Ni; 


has rank n+h, then for arbitrary L>0 we have |Vi.—Vi*| <h,|Wi—W Sh. 


We shall say that a point $x'$ on $a<x'<b$ is a conjugate point of $x=a$ relative
to the differential equations


(6.3) Lily] = 9; 


for a value $\lambda$, if, using the definition of Reid [11], §6, the point $x'$ is a conjugate
point of $x=a$ relative to the differential equations of (3.12) for the
value $\Lambda=\lambda^2$. It is to be remarked that the hypotheses of §2 do not in general
imply that the order of abnormality of the differential equations of (3.12) on
subintervals $ab'$ is constant for $a<b'<b$. Hence, as pointed out in Reid [11],
in order to obtain the usual oscillation theorem it is necessary to use the 
extended definition of conjugate point there introduced. As a result of Theo- 
rem 3.4 above, and the oscillation and comparison theorems for the system 
(3.12) (see Reid [11], §§5, 6, and 7), we have the following result: 


THEOREM 6.2. Suppose that the system (6.1) satisfies hypotheses $(H_1)$–$(H_5)$
and that $\lambda=0$ is not a characteristic value of this problem. Then for arbitrary
values $\lambda\neq 0$ there are on the interval $a<x<b$ at least $V_{|\lambda|}-n$ and at most $V_{|\lambda|}$
conjugate points of $x=a$ relative to (6.3) for the given $\lambda$.

In case the following additional assumption is satisfied, the conjugate 
points are given by the zeros of a certain determinant. 

$(H_7)$ If $b'$ is an arbitrary value on $a<b'<b$, there is no solution of $L_i[y]=0$
satisfying $B_{ij}(x)y_j(x)\equiv 0$ on $ab'$ except the identically vanishing solution
$y_i\equiv 0$.


The statement $(H_7)$ is the phrasing in terms of the system (6.3) of the condition
that the differential equations of (3.12) are normal on every subinterval
$ab'$. In this case the conjugate points of $x=a$ relative to (3.12) for a value $\Lambda$
are the zeros of $|z_{ij}(x)|$, where


are linearly independent solutions of the differential equations of (3.12) 
such that 
2i;(a) =0, 1,7=1, 


THEOREM 6.3. Suppose that the system (6.1) satisfies (Hz) in addition to the 
hypotheses of Theorem 6.2. If yi=yii(x:d), G=1,--- , a), denote the solutions 
of (6.3) for which y;;(a:d) =5;;, then for arbitrary values \ #0 the determinant 
(6.4) | — d)| 


has on a<x<b atleast V\,,—m and at most Vy, zeros, each zero counted a num- 
ber of times equal to its index for the system of linear homogeneous equations 
whose coefficients are the rows of the determinant (6.4). 


Theorem 6.3 is a ready consequence of the fact that 


is a linearly independent set of solutions of the differential equations of (3.12) 
for such that 2;;(a) =0, (¢, 7=1,---,m). 
METRIC SPACES AND POSITIVE DEFINITE
FUNCTIONS* 


BY 
I. J. SCHOENBERG 


1. Introduction. Let $E_m$ denote the m-dimensional euclidean space and
generally $E_m^p$ the pseudo-euclidean space of m real variables with the distance
function

$$\bigl(|x_1-x_1'|^p+\cdots+|x_m-x_m'|^p\bigr)^{1/p},\qquad p>0.$$

As $p\to\infty$ we get the space $E_m^\infty$ with the distance function $\max_{i=1,\dots,m}|x_i-x_i'|$.
Let, furthermore, $l^p$ stand for the space of real sequences with the series of
pth powers of the absolute values convergent. Similarly let $L^p$ denote the
space of real measurable functions in the interval (0, 1) which are summable
to the pth power, while $C$ shall mean the space of real continuous functions
in the same interval. In all these spaces a distance function is assumed to be
defined as usual.† $L^2$ is equivalent to the real Hilbert space $\mathfrak{H}$. The spaces
$E_m^p$, $l^p$, and $L^p$ are metric only if $p\ge 1$, but we shall consider them also for
positive values of $p<1$. Finally, if $S$ is a (not necessarily metric) space with
the distance function $PP'$, we shall denote by $S(\gamma)$ the new space which arises
by changing the distance function from $PP'$ to $PP'^{\gamma}$, $(\gamma>0)$.

A general theorem of Banach and Mazur ([1], p. 187) states that any
separable metric space $S$ may be imbedded isometrically in the space $C$. Furthermore,
as a special case of a well known theorem of Urysohn, any such
space $S$ may be imbedded topologically in $\mathfrak{H}$. Isometric imbeddability of $S$
in $\mathfrak{H}$ is, however, a much more restricted property of $S$.

The chief purpose of this paper is to point out the intimate relationship
between the problem of isometric imbedding and the concept of positive definite
functions, if this concept is properly enlarged. As a first approach to this
connection we consider here isometric imbedding in Hilbert space only. It
turns out that the possibility of imbedding‡ in $\mathfrak{H}$ is very easily expressible
in terms of the elementary function $e^{-t^2}$ and the concept of positive definite
functions (Theorem 1). The author's previous result ([10]) to the effect that
$\mathfrak{H}(\gamma)$, $(0<\gamma<1)$, which is the space arising from $\mathfrak{H}$ by raising its metric to a

* Presented to the Society, December 29, 1937; received by the editors December 14, 1937. 

† See, for example, Banach [1], pp. 11-12. The numbers in square brackets refer to the list of
references at the end of the paper.

‡ Here and below the word “imbedding” stands for “isometrical imbedding.”
fractional power, is imbeddable in $\mathfrak{H}$,* appears again as a simple consequence
(Corollary 1). For the class of spaces $S_m$, arising from the euclidean space $E_m$
by a general change of metric of the vector type (11) below, the condition of
imbeddability in $\mathfrak{H}$ is directly expressible in a simple way in terms of the
usual concept of positive definite functions as described by Mathias and
Bochner (Theorem 2). The solution of this problem for $m=1$ (the problem of
“screw lines” in $\mathfrak{H}$, von Neumann and Schoenberg [8]) allows us now to derive
purely analytical results in the theory of positive definite functions with
which it is equivalent. Two readily defined classes of positive definite functions
are completely determined (Theorems 3 and 4). In particular two new
proofs are given for the already known fact that the function $\exp[-|x|^p]$
is positive definite for values of $p$ in the range $0<p\le 2$, and not positive
definite if $p>2$† (§5). One proof is geometrical, the other proof covering the
case $0<p<2$ is analytical and may be read independently of the rest of this
paper.

All previous results now allow us to conclude that the spaces $E_m^p(\gamma)$, $l^p(\gamma)$,
and $L^p(\gamma)$ are imbeddable in $\mathfrak{H}$ for values of $\gamma$ in the range $0<\gamma\le p/2$, where
$p$ is restricted to the range $0<p\le 2$ (Theorem 5). For $p=2$ we regain our
previous result concerning $\mathfrak{H}(\gamma)$. It is interesting to compare and combine
this result about $L^p$ with a theorem of Banach and Mazur ([1], p. 203) concerning
the linear dimensions of $L^p$ and $L^q$. According to this theorem $L^2$
is imbeddable in $L^q$ if $q>1$. Now as $L^p(\gamma)$, $(0<p\le 2,\ 0<\gamma\le p/2)$, is imbeddable
in $\mathfrak{H}$ or $L^2$, it follows that $L^p(\gamma)$, $(0<p\le 2,\ 0<\gamma\le p/2)$, is imbeddable in
any $L^q$ if $q>1$.

Similar as yet unsolved problems concerning the case of p>2 are shown 
to be equivalent to further knowledge as to the positive definite character of 
certain special functions of m variables. One of these unsolved problems sug- 
gests a probably possible way of extending an interesting theorem of L. M. 
Blumenthal on metric sets of four points to such sets of any number of points. 

The reader primarily interested in the geometrical results of this paper 
may omit §4 and §5 entirely by taking the known Corollary 3 for granted. 

2. Positive definite functions. A real continuous function $f(x_1,x_2,\dots,x_m)$
which is defined for all real values of its variables and is even, that is for which
$f(-x_1,\dots,-x_m)=f(x_1,\dots,x_m)$, is said to be positive definite (p.d.) if

$$\sum_{i,k=1}^{n} f\bigl(x_1^{(i)}-x_1^{(k)},\dots,x_m^{(i)}-x_m^{(k)}\bigr)\rho_i\rho_k\ge 0 \tag{1}$$


* W. A. Wilson [12] had previously remarked that $E_1(1/2)$ is imbeddable in $\mathfrak{H}$.
† Due in various parts to Pólya, Mathias, and P. Lévy. For references see Bochner [3], pp.
76-77. For still another treatment of the case $0<p<2$ see Bochner [5].




for arbitrary real $\rho_i$ and any points $(x_1^{(i)},\dots,x_m^{(i)})$, $(i=1,\dots,n)$, for $n=2,3,\dots$.
For $n=2$, (1) gives $|f(x_1,\dots,x_m)|\le f(0,\dots,0)$, hence p.d. functions are
bounded throughout space and take their maximum value at the origin. We
shall use in the sequel only the following most simple properties of this interesting
class of functions.


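I. The function defined by

$$f(x_1,\dots,x_m)=\int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}\exp[i(x_1u_1+\cdots+x_mu_m)]\,\varphi(u_1,\dots,u_m)\,du_1\cdots du_m,$$

where $\varphi$ is a non-negative even function such that its integral over the whole
space exists, is p.d.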

II. Any finite linear combination of p.d. functions with non-negative co- 
efficients is again p.d. The product of two p.d. functions is again p.d. 

III. A continuous function which is the limit of a sequence of p.d. func- 
tions is itself p.d. 

For completeness we sketch the simple proofs. Property I follows from
the fact that the left-hand side of (1) reduces to

$$\int\!\cdots\!\int\Bigl|\sum_{k=1}^{n}\rho_k\exp\bigl[i(x_1^{(k)}u_1+\cdots+x_m^{(k)}u_m)\bigr]\Bigr|^2\varphi\,du_1\cdots du_m\ge 0.$$

The additive property is clear from the fact that (1) is a linear inequality in $f$.
The multiplicative property is a direct consequence of a lemma of I. Schur
([11], p. 10)* which states that if $\sum_1^n a_{ik}\rho_i\rho_k$, $\sum_1^n b_{ik}\rho_i\rho_k$ are two positive quadratic
forms, then $\sum_1^n a_{ik}b_{ik}\rho_i\rho_k$ is also positive. Property III is immediately
clear by continuity. We shall later on use the fact that $f(x)=\cos\lambda x$ is a p.d.
function.
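
The positive definiteness of $\cos\lambda x$ can be seen directly from the addition formula: the form (1) equals $(\sum\rho_i\cos\lambda x_i)^2+(\sum\rho_i\sin\lambda x_i)^2$. The following Python sketch is an illustration added here, not part of the original argument; the function names, the sample points, and the value $\lambda=1.7$ are arbitrary choices. It compares the two expressions numerically.

```python
import math
import random


def cosine_form(lam, xs, rhos):
    """Left-hand side of (1) for f(x) = cos(lam * x) in one variable."""
    return sum(
        math.cos(lam * (xi - xk)) * ri * rk
        for xi, ri in zip(xs, rhos)
        for xk, rk in zip(xs, rhos)
    )


def sum_of_squares(lam, xs, rhos):
    """(sum rho_i cos(lam x_i))^2 + (sum rho_i sin(lam x_i))^2, which is >= 0."""
    c = sum(r * math.cos(lam * x) for x, r in zip(xs, rhos))
    s = sum(r * math.sin(lam * x) for x, r in zip(xs, rhos))
    return c * c + s * s


rng = random.Random(0)  # fixed seed so the run is reproducible
xs = [rng.uniform(-5.0, 5.0) for _ in range(6)]
rhos = [rng.uniform(-1.0, 1.0) for _ in range(6)]
lhs = cosine_form(1.7, xs, rhos)
rhs = sum_of_squares(1.7, xs, rhos)
print(lhs, rhs)  # the two values agree up to rounding
```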

A closely allied concept is as follows. Let $S$ be a space in which a distance
function $PP'$ is defined subject to the following conditions: (1) $PP'=P'P\ge 0$
for arbitrary points $P$, $P'$ in $S$, (2) $PP=0$. A real continuous even function
$g(t)$, which is defined in the range of values of $\pm PP'$, ($P$, $P'$ in $S$), is said to
be positive definite in $S$ if

$$\sum_{i,k=1}^{n} g(P_iP_k)\rho_i\rho_k\ge 0 \tag{3}$$

for arbitrary real $\rho_i$ and any points $P_i$ (different or not) of $S$, $(n=2,3,\dots)$.


* For a discussion and consequences of Schur's result see also Pólya and Szegő [9], pp. 106-107,
307-308.
This class of functions $g(t)$ has for a given space $S$ properties II and III above
for similar reasons. Both definitions agree if $m=1$, while $S$ is the one-dimensional
euclidean space $E_1$.

The peculiar relationship between the definitions, which will be clearer
later on, is already exhibited by the following simple example needed in the
sequel. From the formula

$$e^{-x^2}=\frac{1}{2\sqrt{\pi}}\int_{-\infty}^{\infty}e^{ixu}\,e^{-u^2/4}\,du$$

we get, replacing $x$ by $x_j$, $(j=1,\dots,m)$, and multiplying the resulting equations,

$$\exp[-(x_1^2+\cdots+x_m^2)]=(2\sqrt{\pi})^{-m}\int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}\exp[i(x_1u_1+\cdots+x_mu_m)]\cdot\exp[-(u_1^2+\cdots+u_m^2)/4]\,du_1\cdots du_m, \tag{4}$$

which shows at a glance (property I) that the function

$$f=\exp[-(x_1^2+\cdots+x_m^2)]$$

is p.d. and also that $g(t)=e^{-t^2}$ is p.d. in $E_m$. As $m$ is arbitrary, this implies
that the function $e^{-t^2}$ is positive definite in the real Hilbert space $\mathfrak{H}$.
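
As a numerical illustration of the one-variable formula behind (4), the sketch below evaluates $(2\sqrt{\pi})^{-1}\int\cos(xu)\,e^{-u^2/4}\,du$ by the trapezoid rule and compares it with $e^{-x^2}$. It is an added check, not part of the original text; the normalization used is the standard Gaussian Fourier integral, and the truncation width and step count are arbitrary assumptions. Only the cosine part is integrated, since the sine part is odd and integrates to zero.

```python
import math


def gaussian_via_fourier(x, half_width=40.0, steps=40000):
    """Trapezoid rule for (2*sqrt(pi))**-1 * integral of cos(x*u)*exp(-u*u/4) du.

    The integrand is negligible beyond |u| = half_width, so truncating
    the infinite range there does not affect the result noticeably.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        u = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.cos(x * u) * math.exp(-u * u / 4.0)
    return total * h / (2.0 * math.sqrt(math.pi))


approx = gaussian_via_fourier(1.3)
exact = math.exp(-1.3 ** 2)
print(approx, exact)
```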

3. Conditions for isometric imbedding in Hilbert space in terms of positive
definite functions. It was pointed out by K. Menger and by the author (for
references see [10]) that a necessary and sufficient condition that a separable
space $S$ be imbeddable in $\mathfrak{H}$ is that for any $n+1$ points of $S$, $(n\ge 2)$, we have

$$\sum_{i,k=1}^{n}\bigl(P_0P_i^2+P_0P_k^2-P_iP_k^2\bigr)\rho_i\rho_k\ge 0,$$

for arbitrary real $\rho_i$.* Let us now put this condition in a slightly more symmetrical
form. By summing over the three terms separately, we may write
this as

$$\sum_{1}^{n}\rho_k\cdot\sum_{1}^{n}P_0P_i^2\rho_i+\sum_{1}^{n}\rho_i\cdot\sum_{1}^{n}P_0P_k^2\rho_k-\sum_{i,k=1}^{n}P_iP_k^2\rho_i\rho_k\ge 0,$$


* This was proved for the case when $S$ is a separable semi-metric space; that is, when the metric
$PP'$ satisfies the additional condition (3) $PP'>0$ if $P\neq P'$, whereas we postulated only that (1)
$PP'=P'P\ge 0$, (2) $PP=0$. However, our quadratic inequality, for $n=2$, insures the triangle inequality
$PQ+QR\ge PR$ for any three points of $S$. If we now identify with $P$ all points $Q$ such that
$PQ=0$ (which is now allowed, since $PQ=0$ implies $RP=RQ$ for any $R$, on account of the triangle
inequality) and do this for all points of $S$, we get a new space which is not only semi-metric but even
metric.


and if we set $\rho_0=-\sum_1^n\rho_k$, this last inequality is equivalent to

$$-\rho_0\sum_{1}^{n}P_0P_i^2\rho_i-\rho_0\sum_{1}^{n}P_0P_k^2\rho_k-\sum_{i,k=1}^{n}P_iP_k^2\rho_i\rho_k\ge 0,$$

or, finally,

$$\sum_{i,k=0}^{n}P_iP_k^2\rho_i\rho_k\le 0. \tag{5}$$

Hence the inequality (5), as a consequence of the relation

$$\sum_{i=0}^{n}\rho_i=0, \tag{6}$$

is equivalent to the above stated condition of imbeddability.
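
The condition (5) under the constraint (6) is easy to test on small examples. The Python sketch below is an added illustration, not part of the original paper, and both examples are assumptions chosen for the purpose. It evaluates the form $\sum P_iP_k^2\rho_i\rho_k$ for random points of the euclidean plane, where it should never be positive, and for a four-point "star" space (one centre at distance 1 from three points whose mutual distances are 2), where the weights $(-3,1,1,1)$ make it positive, so that this space is not imbeddable.

```python
import random


def form5(dist, rhos):
    """Quadratic form of (5): sum over i, k of dist[i][k]**2 * rho_i * rho_k."""
    n = len(rhos)
    return sum(
        dist[i][k] ** 2 * rhos[i] * rhos[k] for i in range(n) for k in range(n)
    )


def zero_sum_weights(n, rng):
    """Random real weights shifted so that they satisfy (6), i.e. sum to zero."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(w) / n
    return [v - mean for v in w]


rng = random.Random(1)  # fixed seed so the run is reproducible

# Five random points of the euclidean plane and their mutual distances.
pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
euclid = [
    [((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for b in pts] for a in pts
]
euclid_values = [form5(euclid, zero_sum_weights(5, rng)) for _ in range(200)]

# Star space: index 0 is the centre, indices 1..3 are the outer points.
star = [
    [0, 1, 1, 1],
    [1, 0, 2, 2],
    [1, 2, 0, 2],
    [1, 2, 2, 0],
]
star_value = form5(star, [-3, 1, 1, 1])  # -18 + 24 = 6, which violates (5)

print(max(euclid_values), star_value)
```

For euclidean points the form equals $-2\,|\sum\rho_iP_i|^2$ whenever (6) holds, which explains why the first maximum is never positive beyond rounding error.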

Now we are prepared to express this condition in terms of p.d. functions.
We have seen in the previous section that the function $e^{-t^2}$ is p.d. in $\mathfrak{H}$.
Clearly also $e^{-\lambda^2t^2}$ ($\lambda$ real) is p.d. in $\mathfrak{H}$ as can be seen from (4) or by direct
reasoning. Hence $e^{-\lambda^2t^2}$ is also p.d. in any subset of $\mathfrak{H}$. Now if $S$ is to be imbeddable
in $\mathfrak{H}$, it is clearly a necessary condition that $e^{-\lambda^2t^2}$ be p.d. in $S$.

Let us prove now that this condition, together with the separability of $S$,
is also sufficient. To prove this we have to show that (5) holds as a consequence
of (6) for any $n+1$ points $P_i$ of $S$. Now as $e^{-\lambda^2t^2}$ is p.d. in $S$, we have

$$\sum_{i,k=0}^{n}\rho_i\rho_k\exp[-\lambda^2P_iP_k^2]\ge 0 \tag{7}$$

by (3). We complete the proof in two different ways.


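First proof. By expanding the left-hand side of (7) in power series we
have, in view of (6),

$$-\lambda^2\sum_{0}^{n}P_iP_k^2\rho_i\rho_k+\frac{\lambda^4}{2!}\sum_{0}^{n}P_iP_k^4\rho_i\rho_k-\cdots\ge 0,$$

which clearly implies (5) for small values of $\lambda$.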
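Second proof. Using the formulas

$$t^{\alpha}=c(\alpha)\int_0^{\infty}\bigl(1-e^{-\lambda^2t^2}\bigr)\lambda^{-1-\alpha}\,d\lambda,\qquad c(\alpha)=\Bigl\{\int_0^{\infty}\bigl(1-e^{-\lambda^2}\bigr)\lambda^{-1-\alpha}\,d\lambda\Bigr\}^{-1},\qquad 0<\alpha<2,\ t\ge 0, \tag{8}$$

which are immediately proved by substituting in the first integral $\lambda t^{-1}$ for $\lambda$,
we have

$$P_iP_k^{\alpha}=c(\alpha)\int_0^{\infty}\bigl(1-\exp[-\lambda^2P_iP_k^2]\bigr)\lambda^{-1-\alpha}\,d\lambda.$$

If now $\rho_i$ are any real numbers satisfying (6), we get, for $0<\alpha<2$,

$$\sum_{i,k=0}^{n}P_iP_k^{\alpha}\rho_i\rho_k=-c(\alpha)\int_0^{\infty}\Bigl\{\sum_{0}^{n}\rho_i\rho_k\exp[-\lambda^2P_iP_k^2]\Bigr\}\lambda^{-1-\alpha}\,d\lambda\le 0, \tag{9}$$

by (7) and the obvious fact that $c(\alpha)>0$. Now we get again the desired inequality
(5) on allowing $\alpha$ in (9) to approach the limit 2. We have thus proved
the following theorem: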


THEOREM 1. A necessary and sufficient condition that a separable space $S$
with a distance function $PP'$ with the properties $PP'=P'P\ge 0$, $PP=0$, be isometrically
imbeddable in $\mathfrak{H}$, is that the family of functions $e^{-\lambda t^2}$, $(\lambda>0)$, be positive
definite in $S$.
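
To see the criterion of Theorem 1 at work, the following Python sketch (an added illustration; the two four-point spaces, the weights, and the value $\lambda=0.05$ are assumptions chosen for the example) evaluates the form (3) for $g(t)=e^{-\lambda t^2}$. For four collinear points the form is positive, as it must be in a subset of Hilbert space. For the star space (a centre at distance 1 from three points at mutual distance 2) the same weights give a negative value, so $e^{-\lambda t^2}$ is not positive definite there and the space is not imbeddable.

```python
import math


def gaussian_form(dist, rhos, lam):
    """Form (3) for g(t) = exp(-lam * t**2) on a finite space with distances dist."""
    n = len(rhos)
    return sum(
        math.exp(-lam * dist[i][k] ** 2) * rhos[i] * rhos[k]
        for i in range(n)
        for k in range(n)
    )


# Star space: index 0 is the centre, indices 1..3 are the outer points.
STAR = [
    [0, 1, 1, 1],
    [1, 0, 2, 2],
    [1, 2, 0, 2],
    [1, 2, 2, 0],
]

# Four points 0, 1, 2, 3 of the real line with the usual distance.
LINE = [[abs(i - k) for k in range(4)] for i in range(4)]

weights = [-3, 1, 1, 1]
star_value = gaussian_form(STAR, weights, 0.05)
line_value = gaussian_form(LINE, weights, 0.05)
print(star_value, line_value)  # the first is negative, the second positive
```

For the star space the form reduces to $12-18e^{-\lambda}+6e^{-4\lambda}$, which behaves like $-6\lambda$ for small $\lambda$.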


Notice that the condition of this theorem may be restricted to require that
$e^{-\lambda t^2}$ be p.d. in $S$ only for a set of positive values of $\lambda$ admitting the origin
$\lambda=0$ as a point of accumulation. The properties II and III (§2) will then imply
that $e^{-\lambda t^2}$ is p.d. in $S$ for all positive values of $\lambda$.

Recalling that we denote by $S(\gamma)$, $(\gamma>0)$, the space obtained from $S$ by
replacing its metric $PP'$ by $PP'^{\gamma}$, it is of interest to point out the further fact,
implicitly contained in the previous second proof, that if $S$ is imbeddable in
$\mathfrak{H}$, then so is $S(\gamma)$ for any value of $\gamma$ in the range $0<\gamma<1$. Indeed, let $\alpha=2\gamma$;
if $S$ is imbeddable in $\mathfrak{H}$, then $e^{-\lambda t^2}$, $(\lambda>0)$, is p.d. in $S$; hence (9) holds
in virtue of (6), and $S(\gamma)$ is therefore imbeddable in $\mathfrak{H}$ on account of the form
(5) of the imbeddability condition. Applying this conclusion to $S=\mathfrak{H}$ itself,
we have the following corollary:


COROLLARY 1. The space $\mathfrak{H}(\gamma)$, $(0<\gamma<1)$, obtained from Hilbert space $\mathfrak{H}$
by raising its metric to a power $\gamma$, is imbeddable in $\mathfrak{H}$.*


* We may even state the following more general theorem: Let

$$F(t)=\int_0^{\infty}\frac{1-e^{-\lambda^2t^2}}{\lambda^2}\,d\sigma(\lambda), \tag{8'}$$

where $\sigma(\lambda)$ is non-decreasing for $\lambda\ge 0$ and is such that $\int_1^{\infty}\lambda^{-2}\,d\sigma(\lambda)$ exists. If we change the metric of
$\mathfrak{H}$ from $PP'$ to $[F(PP')]^{1/2}$, then the new space thus arising is imbeddable in $\mathfrak{H}$. Indeed (6), (7), and (8')
imply

$$\sum_{i,k=0}^{n}F(P_iP_k)\rho_i\rho_k=-\int_0^{\infty}\Bigl\{\sum_{0}^{n}\rho_i\rho_k\exp[-\lambda^2P_iP_k^2]\Bigr\}\lambda^{-2}\,d\sigma(\lambda)\le 0,$$

and the theorem follows on account of the form (5) and (6) of the imbeddability condition. We leave
open the question whether or not (8') gives the most general function $F(t)$ with this property.

Added in proof, August, 1938: Formula (8') gives indeed all functions with the property stated
above. See the following paper Metric spaces and completely monotone functions, to appear in the
Annals of Mathematics.
We now turn our attention to a special class of spaces of the type $S$ which
are defined as follows. Let $\varphi(x_1,\dots,x_m)$ be a continuous function defined for
all real values of its variables with the following properties

$$\varphi(x_1,\dots,x_m)\ge 0,\qquad \varphi(0,\dots,0)=0,\qquad \varphi(-x_1,\dots,-x_m)=\varphi(x_1,\dots,x_m), \tag{10}$$

and call $S_m$ the space of points $P=(x_1,\dots,x_m)$ with the distance function

$$PP'=\bigl[\varphi(x_1-x_1',\dots,x_m-x_m')\bigr]^{1/2}. \tag{11}$$

For such a space, which is obviously separable, the condition of Theorem 1
that $e^{-\lambda t^2}$ be p.d. in $S_m$ may be expressed in a more familiar form. Indeed,
on comparing the formulas (1), (3), and (11), we now obtain from Theorem 1
the following theorem:


THEOREM 2. The space $S_m$ of m real numbers with the metric (11) is imbeddable
in $\mathfrak{H}$ if and only if the functions

$$f_\lambda=\exp[-\lambda\varphi(x_1,\dots,x_m)] \tag{12}$$

are positive definite in the sense (1) of Mathias and Bochner for all positive
values of $\lambda$.

If $\varphi(x_1,\dots,x_m)$ is a homogeneous function,* the conditions (12) may be
replaced by the single condition that the function

$$f_1=\exp[-\varphi(x_1,\dots,x_m)] \tag{13}$$

be positive definite.
We mention the following corollary:

COROLLARY 2. If $\varphi(x_1,\dots,x_m)$ is homogeneous and such that $e^{-\varphi}$ is positive
definite, then

$$\exp[-\varphi^{\gamma}],\qquad 0<\gamma<1,$$

is also positive definite.


For if $e^{-\varphi}$ is p.d., then by Theorem 2, $S_m$ is imbeddable in $\mathfrak{H}$ and therefore
$S_m(\gamma)$ is also (Corollary 1). Hence $\exp[-\varphi^{\gamma}]$ is seen to be p.d. by
applying Theorem 2 the other way around.
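
Taking $m=1$ and $\varphi(x)=x^2$ in Corollary 2 gives the functions $\exp[-|x|^{p}]$ with $p=2\gamma<2$. A quick way to probe the role of the exponent is to evaluate the form (1) at the three points $0,h,2h$ with the weights $1,-2,1$, which gives $6f(0)-8f(h)+2f(2h)$. The Python sketch below is an added illustration with arbitrarily chosen values $h=0.1$ and $p=1,2,3$. A positive value proves nothing by itself, but a negative one shows that the function is not positive definite, in agreement with the statement of the Introduction for $p>2$.

```python
import math


def second_difference_form(p, h):
    """Form (1) for f(x) = exp(-|x|**p), points 0, h, 2h and weights 1, -2, 1."""

    def f(t):
        return math.exp(-abs(t) ** p)

    # Diagonal terms give (1 + 4 + 1) f(0); the pairs at distance h give
    # 2 * (-2 - 2) f(h); the pair at distance 2h gives 2 * f(2h).
    return 6.0 * f(0.0) - 8.0 * f(h) + 2.0 * f(2.0 * h)


for p in (1, 2, 3):
    print(p, second_difference_form(p, 0.1))
```

For small $h$ the form behaves like $h^{p}(8-2^{p+1})$, which changes sign exactly at $p=2$.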

4. Determination of certain classes of positive definite functions. In this
section we shall assume $m=1$, $\varphi(x)$ being therefore a continuous non-negative
even function vanishing at the origin. In this case we know precisely when $S_1$,


* We say that $\varphi$ is homogeneous of degree $\kappa$ if $\varphi(tx_1,\dots,tx_m)=t^{\kappa}\varphi(x_1,\dots,x_m)$ holds identically
in the $x_i$ and for $t>0$. A continuous homogeneous function $\varphi$ with the properties (10) must, unless it
vanishes identically, have a positive degree $\kappa$.
which is the space of real numbers with the metric $[\varphi(x-x')]^{1/2}$, is imbeddable
in $\mathfrak{H}$ (von Neumann and Schoenberg [8]). The most general function $\varphi(x)$
for which this is the case is of the form

$$\varphi(x)=\int_0^{\infty}\frac{\sin^2 xu}{u^2}\,d\sigma(u), \tag{14}$$

where $\sigma(u)$ is non-decreasing for $u\ge 0$ such that

$$\int_1^{\infty}\frac{d\sigma(u)}{u^2}\ \text{exists.} \tag{15}$$


Hence by Theorem 2 we infer that if

$$f_\lambda(x)=e^{-\lambda\varphi(x)}$$

is p.d. for $\lambda>0$, then $\varphi(x)$ is of the form (14) and conversely. If now $f(x)$ is
any positive function whose positive powers $[f(x)]^{\lambda}$ are all p.d., then considering

$$\varphi(x)=\log f(0)-\log f(x),$$

where $\varphi(x)$ is necessarily non-negative, even, and vanishing at the origin, we
conclude as before that $\varphi(x)$ is of the form (14). We have thus proved the
following theorem:


THEOREM 3. The most general positive function $f(x)$ whose positive powers
$[f(x)]^{\lambda}$, $(\lambda>0)$, are all positive definite is of the form

$$f(x)=\exp\Bigl[c-\int_0^{\infty}\frac{\sin^2 xu}{u^2}\,d\sigma(u)\Bigr], \tag{16}$$

where $\sigma(u)$ is a non-decreasing function subject to the restriction (15), while $c$ is
any real constant.


A few remarks are called for regarding the condition of this theorem that
$[f(x)]^{\lambda}$ be p.d. for $\lambda>0$. In the first place, as remarked after the statement of
Theorem 1, the range of $\lambda$ in this condition may be restricted to a sequence
of positive numbers tending to zero.

A second and more important remark is that Theorem 3 becomes false if
we assume only that the positive function $f(x)$ is p.d., or in other words:
formula (16) always represents a positive and p.d. function; it does not,
however, represent all such functions but only those whose fractional powers
are also p.d. To prove this statement it suffices to exhibit a p.d. function
$f(x)>0$ such that $[f(x)]^{\lambda}$, $(0<\lambda<1)$, are not all p.d. Such a function is

$$f_\epsilon(x)=\epsilon+\cos^2 x,$$

if $\epsilon$ is sufficiently small, for it is easy to see that

$$[f_\epsilon(x)]^{1/2}=(\epsilon+\cos^2 x)^{1/2}$$

is not p.d. for such values of $\epsilon$. Indeed, if it were p.d. for arbitrarily small $\epsilon$,
it would follow that the limiting function

$$\lim_{\epsilon\to 0}[f_\epsilon(x)]^{1/2}=|\cos x|$$

were also p.d. (property III). But this is not the case as is seen from the
fact that the cosine series

$$|\cos x|=\frac{2}{\pi}+\frac{4}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{4n^2-1}\cos 2nx$$

has some negative coefficients, or even more directly from the fact that the
form

$$\sum_{i,k=1}^{4}|\cos(x_i-x_k)|\,\rho_i\rho_k,$$

where $x_1=0$, $x_2=\pi/4$, $x_3=\pi/2$, $x_4=3\pi/4$, is not positive. Indeed, its determinant
is readily found to be $-1$, hence negative. The essence of the matter is
that Schur's theorem, “If $\sum a_{ik}\rho_i\rho_k$ is positive, then so is $\sum a_{ik}^{n}\rho_i\rho_k$ for
$n=1,2,3,\dots$,” can not be so extended that we may conclude that
$\sum|a_{ik}|^{\lambda}\rho_i\rho_k$, $(\lambda>0)$, is positive.
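
The determinant quoted above can be checked numerically. The Python sketch below is an added verification, not part of the original text. It assumes the four points $x=0,\ \pi/4,\ \pi/2,\ 3\pi/4$, which is the choice for which the matrix of $|\cos(x_i-x_k)|$ has determinant $-1$; the helper `determinant` is a plain Gaussian elimination written for this example. The weights $(1,-1,1,-1)$ give an explicit negative value $4(1-\sqrt{2})$ of the form.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


xs = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]
gram = [[abs(math.cos(xi - xk)) for xk in xs] for xi in xs]
det = determinant(gram)

rhos = [1.0, -1.0, 1.0, -1.0]
form = sum(gram[i][k] * rhos[i] * rhos[k] for i in range(4) for k in range(4))
print(det, form)
```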

The third and last remark is that Theorem 3 is now equivalent to the
theorem that (14) and (15) give the most general $\varphi(x)$ such that the space
of real numbers with the metric $[\varphi(x-x')]^{1/2}$ be imbeddable in $\mathfrak{H}$ (see [8]).
Hence a direct proof of Theorem 3 would furnish a new proof of that theorem.

While the problem of determining all positive and positive definite func- 
tions is yet unsolved, there is another subclass of this class of functions which 
can now be readily determined. It will in fact be a subclass of the class deter- 
mined by Theorem 3. 

Let $\psi(x)$ be a p.d. function, $c$ a real constant. Clearly

$$f(x)=\exp[c+\psi(x)]=e^{c}\Bigl(1+\frac{\psi(x)}{1!}+\frac{\psi(x)^2}{2!}+\cdots\Bigr) \tag{17}$$

is also p.d. in virtue of the properties II and III. It has moreover the
following additional two properties: ($\alpha$) It is bounded away from zero, since
$f(x)\ge\exp[c-\psi(0)]>0$. ($\beta$) All its positive powers $[f(x)]^{\lambda}$, $(\lambda>0)$, are also
p.d. Let us show that the converse is true, that any function $f(x)$ having
the properties ($\alpha$) and ($\beta$) is of the form (17) where $\psi(x)$ is p.d. Now an $f(x)$
with the properties ($\alpha$) and ($\beta$) clearly belongs to the class described by Theorem 3,
and all we have to do is to decide which functions of the form (16) are
bounded away from zero. Now this is the case if and only if the exponent (14),
namely

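$$\varphi(x) = \int_0^{\infty} \frac{\sin^2 xu}{u^2}\,d\sigma(u),$$

is bounded in $-\infty<x<\infty$. The characteristic conditions for this have likewise
been determined by von Neumann and the author in [8] and they are
as follows: $\varphi(x)$ is bounded if and only if $\sigma(0)=\sigma(+0)$ and $\int_{+0}^{\infty} u^{-2}\,d\sigma(u)$ exists.
But if these conditions are fulfilled, then the non-decreasing function

$$\tau(u) = \int_{+0}^{u} \frac{d\sigma(v)}{v^2}$$

is bounded for $u>0$, and (14) can be written as

$$\varphi(x) = \int_0^{\infty} \frac{\sin^2 xu}{u^2}\,d\sigma(u) = \int_{+0}^{\infty} \sin^2 xu\,d\tau(u) = \frac{1}{2}\int_{+0}^{\infty} (1-\cos 2xu)\,d\tau(u) = \psi(0)-\psi(x), \tag{18}$$

where

$$\psi(x) = \frac{1}{2}\int_{+0}^{\infty} \cos 2ux\,d\tau(u)$$

is p.d. Hence, by (14), (18), and (16),

$$f(x) = \exp[c-\varphi(x)] = \exp[c-\psi(0)+\psi(x)]$$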


and is indeed of the desired form (17). We have thus proved the following 
theorem: 


THEOREM 4. The most general positive function $f(x)$ which is bounded away
from zero and whose positive powers $[f(x)]^{\lambda}$, $(\lambda>0)$, are positive definite is of the
form

$$f(x) = \exp[c + \psi(x)],$$

where $\psi(x)$ is positive definite and $c$ is a real constant.

EQUIVALENT STATEMENT: If $f(x)$ is a positive function, then $\log f(x)$ differs
by a constant from a positive definite function if and only if $f(x)$ is bounded away
from zero and its powers $[f(x)]^{\lambda}$, $(\lambda>0)$, are all positive definite.


5. On the positive definite character of $\exp[-|x|^p]$. We shall need in the
next section the following well known result:

COROLLARY 3. The function

$$\exp[-|x|^p] \tag{19}$$

is positive definite if $0<p\le 2$ and not positive definite if $p>2$.

First proof. We know that $\exp[-x^2]$ is p.d. from the formula just preceding
(4) and property I. Hence $\exp[-|x|^{2\gamma}]$, $(0<\gamma<1)$, is p.d. by Corollary
2, as $|x|^2$ is homogeneous, and the first part of the statement is proved.
Suppose now that (19) were p.d. for a value $p>2$. By Theorem 2 this would
imply that the space $E_1(p/2)$, which is the real axis with the metric $|x-x'|^{p/2}$,
were imbeddable in $\mathfrak{H}$. But this is clearly impossible as $E_1(p/2)$ is not even a
metric space since the distances among the points $x=0, 1, 2$ are $01=1$, $12=1$,
$02=2^{p/2}>2$, and they violate the triangle inequality.
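The negative half of this statement can be illustrated numerically. The following sketch (hypothetical helper names, not taken from the paper) checks that the distances among 0, 1, 2 under $|x-x'|^{p/2}$ violate the triangle inequality when $p>2$, and evaluates the quadratic form of $\exp[-|x|^p]$ on the points $0, h, 2h$ with weights $(1,-2,1)$, a test direction assumed here only for illustration.

```python
import math

def violates_triangle(p):
    """True if the points 0, 1, 2 with distance |x - x'|**(p/2) break the triangle inequality."""
    d01 = abs(0 - 1) ** (p / 2)
    d12 = abs(1 - 2) ** (p / 2)
    d02 = abs(0 - 2) ** (p / 2)
    return d02 > d01 + d12

def second_difference_form(p, h):
    """Quadratic form sum f(x_i - x_k) rho_i rho_k for f(x) = exp(-|x|**p),
    points (0, h, 2h) and weights rho = (1, -2, 1)."""
    def f(x):
        return math.exp(-abs(x) ** p)
    points = [0.0, h, 2.0 * h]
    rho = [1.0, -2.0, 1.0]
    return sum(f(points[i] - points[k]) * rho[i] * rho[k]
               for i in range(3) for k in range(3))

for p in (1.0, 2.0, 3.0, 4.0):
    print(p, violates_triangle(p), second_difference_form(p, 0.5))
```

A negative value of the form proves that the function is not p.d.; a positive value for a single choice of points is merely consistent with positive definiteness.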

Second direct proof for the case $0<p\le 2$. First we shall prove directly on
the basis of properties II and III (§2) that the function $f(x)$ defined by
(16) and (15) is p.d. By property III (7) it suffices to show that

$$\exp\left[-\int_0^{T} \frac{\sin^2 xu}{u^2}\,d\sigma(u)\right]$$


is p.d. By the definition of the Stieltjes integral, this is a limit of functions
of the form

$$\exp\left[-\sum_{\nu=1}^{n} 2\lambda_\nu \sin^2 xu_\nu\right], \qquad \lambda_\nu>0,$$

and it suffices to show that each of the factors is p.d. Now

$$\exp[-2\lambda\sin^2 xu] = \exp[-\lambda]\,\exp[\lambda\cos 2xu],$$

and this is p.d. because $\cos 2xu$ is p.d. and the exponential series has positive
coefficients only. This point being disposed of, it suffices to show that $|x|^p$,
$(0<p\le 2)$, is of the form (14).* This is apparent on account of the formula

$$|x|^p = \left(\int_0^{\infty} \sin^2 u\cdot u^{-1-p}\,du\right)^{-1} \int_0^{\infty} \sin^2 xu\cdot u^{-1-p}\,du,$$

valid for $0<p<2$, which is as easily established as the similar representation
(8). An obvious step-function for $\sigma(u)$ in (14), with only one jump at the
origin, settles the case $p=2$.

6. On the imbedding of the spaces $E_m^p$, $l^p$, $L^p$, $(0<p\le 2)$, in $\mathfrak{H}$ by a change
of metric. We learned in §2 that $e^{-t^2}$ is p.d. in $\mathfrak{H}$ or $L^2$. We shall now show
that $e^{-|t|^p}$ is p.d. in the spaces $E_m^p$, $l^p$, and $L^p$ for values of $p$ in the range
$0<p\le 2$.

* It is not by accident that the function (19) is of the form (16) if $0<p\le 2$. For if (19) is p.d.,
its positive powers must also be p.d., and the function is necessarily of the form (16), by Theorem 3.
Notice that it is not bounded away from zero and hence is not of the form (17).

The function $\exp[-|x|^p]$ was found to be p.d. if $0<p\le 2$. Hence the
functions $\exp[-|x_i|^p]$, $(i=1, 2, \cdots, m)$, are also p.d. when regarded as
functions of the $m$ variables $x_i$, and we may infer, by property II (§2), that
their product

$$f = \exp[-(|x_1|^p + \cdots + |x_m|^p)], \qquad 0<p\le 2,$$

is p.d. But this is equivalent to the statement that $\exp[-|t|^p]$ is p.d. in
$E_m^p$. As the similar statement for $l^p$ is proved in the same way as for $L^p$, only
requiring less care, we shall limit ourselves to the consideration of $L^p$. Let

$$P_i = x_i(t),$$

where

$$\int_0^1 |x_i(t)|^p\,dt, \qquad i=1, 2, \cdots, n,$$

exists, be points of $L^p$. For

$$P_iP_k = \left(\int_0^1 |x_i(t)-x_k(t)|^p\,dt\right)^{1/p}$$

we have to show that

$$\sum_{i,k=1}^{n} \rho_i\rho_k \exp[-(P_iP_k)^p] \ge 0 \tag{20}$$

for real $\rho_i$. There is no loss of generality in assuming that the functions $x_i(t)$
are continuous, as continuous functions are everywhere dense in $L^p$. For
such functions we have

$$\left(\frac{1}{m}\sum_{\nu=1}^{m} |x_i(\nu/m)-x_k(\nu/m)|^p\right)^{1/p} \to P_iP_k, \qquad m\to\infty, \tag{21}$$

which proves the inequality (20), for (20) is already known to hold for all
values of $m$ if the $P_iP_k$ of (20) are replaced by the quantities on the left-hand
side of (21).

The fact that $\exp[-|t|^p]$, $(0<p\le 2)$, is p.d. in $E_m^p$, $l^p$, $L^p$ implies that
$\exp[-|t|^2]$, and therefore also $\exp[-\lambda|t|^2]$, $(\lambda>0)$, is p.d. in $E_m^p(p/2)$,
$l^p(p/2)$, and $L^p(p/2)$. We have thus proved the following theorem:


THEOREM 5. The spaces $E_m^p$, $l^p$, and $L^p$, $(0<p\le 2)$, become imbeddable in
$\mathfrak{H}$ if we raise their metrics to the power $p/2$ or less. In other words, $E_m^p(\gamma)$, $l^p(\gamma)$,
and $L^p(\gamma)$ are imbeddable in $\mathfrak{H}$ for values of $\gamma$ in the range $0<\gamma\le p/2$.*


Note that for p=2 this result coincides with Corollary 1. 

The fact that $E_m^p$, $(0<p<1)$, which is not a metric space, may be made
metric by raising its metric to a suitable power was already known in a different
terminology. Indeed, the following readily established "substitute" for
Minkowski's inequality ([7], p. 32)

$$\sum_{i=1}^{m} |x_i+y_i|^p \le \sum_{i=1}^{m} |x_i|^p + \sum_{i=1}^{m} |y_i|^p$$

means precisely that $E_m^p(p)$, and therefore $E_m^p(\gamma)$, $(0<\gamma\le p)$, is a metric
space. Now Theorem 5 states that if we restrict this range of $\gamma$ to $0<\gamma\le p/2$,
then not only can any three points of $E_m^p(\gamma)$ be imbedded isometrically in a
euclidean space, but the same is true for any number of such points, since the
whole space is imbeddable in Hilbert space.

7. Some unsolved problems. We shall devote this last section to a few un- 
solved analytical problems and to a brief discussion of their geometrical im- 
plications. 


PROBLEM 1. Let $p$ be a real number exceeding 2, and let $m=2, 3, \cdots$. Do
there exist positive exponents $\kappa$ such that the function

$$\exp[-(|x_1|^p + \cdots + |x_m|^p)^{\kappa}] \tag{22}$$

is positive definite?

If there are such positive exponents, then Corollary 2 implies the existence
of a positive number $\kappa_m$, with the properties: (22) is p.d. if $0<\kappa\le\kappa_m$ and is not
p.d. if $\kappa>\kappa_m$. Furthermore Corollary 3 implies that $p\kappa_m\le 2$ or $\kappa_m\le 2/p$. Theorem
2 furnishes the following geometrical equivalent to Problem 1. The function
(22) is p.d. if and only if $E_m^p(p\kappa/2)$, and therefore also $l^p(p\kappa/2)$ and
$L^p(p\kappa/2)$, are imbeddable in $\mathfrak{H}$.

Particularly interesting is the next problem concerning the limiting case
$p=\infty$.


* The following is a different phrasing of the fact that $L^p(p/2)$ is imbeddable in $L^2$: If $0<p\le 2$,
then there is a functional $y(t)=U[x(t)]$, defined for all functions $x(t)$ in $L^p$, with values $y(t)$ in $L^2$ such
that the relation

$$\int_0^1 |y_1(t)-y_2(t)|^2\,dt = \int_0^1 |x_1(t)-x_2(t)|^p\,dt$$

holds for two arbitrary $x_1(t)$, $x_2(t)$ of $L^p$ and their images $y_1(t)$, $y_2(t)$. This functional $y=U[x]$ is necessarily
continuous and univalent. A study of its further properties might prove to be of interest in the theory of $L^p$.

PROBLEM 2. Do there exist positive exponents $\gamma$ such that the function

$$\exp\{-[\max(|x_1|, \cdots, |x_m|)]^{2\gamma}\} \tag{23}$$

is positive definite?

Again, if such exponents exist, there would exist a positive number $\gamma_m$
with the properties: (23) is p.d. if $0<\gamma\le\gamma_m$ and is not p.d. if $\gamma>\gamma_m$. Moreover
the function (23) is p.d. if and only if $E_m^{\infty}(\gamma)$ is imbeddable in $\mathfrak{H}$.

The particular interest of this second problem is due to the following
lemma which is due essentially to Fréchet ([6], pp. 161-162). Let us call a
finite metric set and denote by $\sigma_m$ a metric space composed of exactly $m+1$ distinct
points.

LEMMA. Any finite metric set $\sigma_m$ of $m+1$ points may be imbedded isometrically
in $E_m^{\infty}$.

Proof. Let $P_0, P_1, \cdots, P_m$ denote the points of $\sigma_m$, and $P_iP_k$ their distances.
Consider in $E_m^{\infty}$ the following $m+1$ points:

$$Q_i = (x_{1i}, \cdots, x_{mi}) = (P_1P_i, P_2P_i, \cdots, P_mP_i), \qquad i=0, 1, \cdots, m.$$

For their distance $Q_iQ_k$ in $E_m^{\infty}$ we have

$$Q_iQ_k = \max_j |x_{ji}-x_{jk}| = \max_j |P_jP_i-P_jP_k| = P_iP_k,$$

for certainly $|P_jP_i-P_jP_k|\le P_iP_k$, $(j=1, \cdots, m)$, on account of the triangle
inequality in $\sigma_m$, while the equality sign holds if $j$ is equal to whichever
of the numbers $i$ or $k$ happens to be different from zero and hence within the
range of $j$. If both $i=k=0$, the result $Q_0Q_0=0$ was clear from the start.
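The construction in this proof is easy to carry out mechanically. The following sketch, with a made-up four-point metric set and hypothetical function names, builds the points $Q_i$ and recomputes their mutual distances in the max-norm.

```python
def frechet_embedding(dist):
    """Map point i of a finite metric set (distance matrix `dist`, indices 0..m)
    to Q_i = (P_1P_i, ..., P_mP_i), a point with m coordinates."""
    m = len(dist) - 1
    return [[dist[j][i] for j in range(1, m + 1)] for i in range(m + 1)]

def max_distance(q, r):
    """Distance in the max-norm, the metric of the space written E_m^infinity in the text."""
    return max(abs(a - b) for a, b in zip(q, r))

# Hypothetical example: four points on a path with edge lengths 1, 2, 1.5
# (shortest-path distances, so the triangle inequality holds).
dist = [
    [0.0, 1.0, 3.0, 4.5],
    [1.0, 0.0, 2.0, 3.5],
    [3.0, 2.0, 0.0, 1.5],
    [4.5, 3.5, 1.5, 0.0],
]
images = frechet_embedding(dist)
recovered = [[max_distance(images[i], images[k]) for k in range(4)] for i in range(4)]
print(recovered == dist)
```

Here $m=3$, so the four points land in a three-dimensional max-norm space, and the recomputed distances agree with the original ones.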

On the basis of this lemma it is natural to classify finite metric sets according
to their "dimension" as follows: A set $\sigma_n$ is said to be of dimension $m$
(always less than or equal to $n$, on account of the lemma) if it is imbeddable in
$E_m^{\infty}$, but not in $E_{m-1}^{\infty}$. If the last requirement is removed, the dimension of $\sigma_n$ does
not exceed $m$.

If Problem 2 were solved in the affirmative, we would get the following
statement: If $\sigma_n$ is any finite metric set of dimension less than or equal to $m$,
then $\sigma_n(\gamma)$ is imbeddable in $E_n$ if $0<\gamma\le\gamma_m$, and $\gamma_m$ is the best constant.

This would generalize in a certain sense the following theorem due to
L. M. Blumenthal ([2], p. 402): If $\sigma_3$ is a finite metric set composed of four
points, then $\sigma_3(\gamma)$ is imbeddable in $E_3$ if $0<\gamma\le 1/2$, and 1/2 is the best constant.

In conclusion let us point out the following perhaps not trivial remark
concerning Problem 2: $\gamma_2$ exists and is greater than or equal to 1/2. Indeed,
in view of the formula

$$e^{-|t|} = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{e^{itx}}{1+x^2}\,dx,$$

we get by the transformation of variables $u=(\xi+\eta)/2$, $v=(\xi-\eta)/2$ in the
double integral

$$\frac{2}{\pi^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{e^{i(ux+vy)}\,du\,dv}{[1+(u+v)^2][1+(u-v)^2]} = \frac{1}{\pi^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{\exp[i\xi(x+y)/2]\,\exp[i\eta(x-y)/2]}{(1+\xi^2)(1+\eta^2)}\,d\xi\,d\eta$$

$$= \exp\left[-\tfrac{1}{2}|x+y| - \tfrac{1}{2}|x-y|\right] = \exp[-\max(|x|, |y|)],$$

due to the relation $2\max(|x|, |y|) = |x+y| + |x-y|$. This formula shows
that the function (23) is p.d. if $m=2$, $\gamma=1/2$. Hence this much is proved:


If $\sigma_n$ is any finite metric set of dimension not exceeding 2, then $\sigma_n(\gamma)$,
$(0<\gamma\le 1/2)$, is imbeddable in $E_n$.
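The elementary relation on which this remark rests, $2\max(|x|,|y|)=|x+y|+|x-y|$, and the resulting factorization of $\exp[-\max(|x|,|y|)]$ into two one-variable factors can be spot-checked on a grid. The sketch below is only an illustration; the grid and the function names are assumptions made here.

```python
import itertools
import math

def two_max(x, y):
    return 2 * max(abs(x), abs(y))

def sum_of_abs(x, y):
    return abs(x + y) + abs(x - y)

# Sample points -3, -2.75, ..., 3 in each variable.
grid = [k / 4 for k in range(-12, 13)]

identity_gap = max(abs(two_max(x, y) - sum_of_abs(x, y))
                   for x, y in itertools.product(grid, grid))

# exp[-max(|x|,|y|)] equals exp[-|x+y|/2] * exp[-|x-y|/2].
factor_gap = max(
    abs(math.exp(-max(abs(x), abs(y)))
        - math.exp(-abs(x + y) / 2) * math.exp(-abs(x - y) / 2))
    for x, y in itertools.product(grid, grid)
)
print(identity_gap, factor_gap)
```

Each factor is a function of the form $e^{-|t|}$ evaluated at a linear combination of $x$ and $y$, which is what allows the one-variable Fourier formula to be applied twice.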


If $\gamma_3$ exists, then it is certainly less than or equal to 1/2, by Blumenthal's
theorem, and is probably even less than 1/2. A certain special set o, shows 
that if y, exists, it must be less than 0.45. 
ON THE TRANSITIVITY OF PERSPECTIVITY
IN CONTINUOUS GEOMETRIES*


BY
ISRAEL HALPERIN†


Introduction. The class of finite dimensional projective geometries has 
been extended to include non-finite dimensional ones by J. von Neumann’s 
remarkable discovery of continuous geometries.‡ In an axiomatic formulation
of the geometry as an irreducible complemented modular lattice§ the finite- 
ness of the dimensionality is guaranteed by a chain condition. Von Neumann 
drops this chain condition and, retaining explicitly only two of its weak con- 
sequences, namely, completeness of the geometry and a certain continuity 
of the lattice operations, succeeds in establishing the existence of an essen- 
tially unique real-valued dimension function which may have either a discrete 
bounded range (the classical finite dimensional projective geometries) or a 
continuous bounded range (the new continuous geometries). In every case it 
is understood that the dimension function D(a) is to satisfy 


$$D(a+b) + D(ab) = D(a) + D(b) \tag{1}$$

for all a, b.
It is clear that such a dimension function will be closely connected with
perspectivities. For a, b are said to be perspective if there exists a c such that

$$a+c = b+c, \qquad ac = bc;$$

and for such a, b (1) implies

$$D(a)+D(c) = D(a+c)+D(ac) = D(b+c)+D(bc) = D(b)+D(c)$$

and hence, if D(c) is finite, D(a) = D(b). This motivates a definition of equidimensionality,
namely, a and b are called equidimensional if and only if
they are perspective. That this definition will lead to the desired dimension
function (in an irreducible system) depends in an essential way on the fundamental
theorem that a, b equidimensional and b, c equidimensional together
imply a, c equidimensional; in other words, that the relation of perspectivity
is transitive.

* Presented to the Society, December 30, 1936; received by the editors September 14, 1937.

† Sterling Research Fellow.

‡ See J. von Neumann: (1) Proceedings of the National Academy of Sciences, vol. 22 (1936),
pp. 92-100, 101-108; (2) Lectures on Continuous Geometry, planographed, Institute for Advanced
Study, Princeton, N. J., 1935-1937; (3) Continuous Geometry, American Mathematical Society Colloquium
Lectures, to appear in book form. (2) will be referred to as C.G. The writer wishes to express
his thanks to Professor von Neumann for many discussions of his new geometries.

§ See G. Birkhoff, Annals of Mathematics, vol. 36 (1935), pp. 743-748.

The transitivity of perspectivity has been established by von Neumann 
for reducible as well as irreducible systems* but partly by indirect methods 
which require the full force of the completeness and continuity axioms. Now 
while these axioms are indeed necessary for the existence of the dimension 
function (in irreducible systems), weaker ones will secure the transitivity of 
perspectivity (in reducible as well as irreducible systems), in fact, just those 
parts of von Neumann's axioms which involve at most countable sets of elements.†

The present paper is devoted chiefly to a proof of the transitivity of per- 
spectivity which uses direct methods throughout and holds for all systems 
satisfying these weaker axioms. The paper is divided into six sections. The 
weakened set of axioms to be used is formulated in §1. We require parts of 
C.G., part I, usually in very specialized form, and for convenience these are 
collected (briefly) in §§2, 3, 4. The new material in the proof of the transi- 
tivity of perspectivity is contained in §5. The additivity and continuity prop- 
erties of perspectivity are established in §6. The Lemma 5.1 in §5 may 
perhaps be not without some interest of its own. 

1. The partially ordered system. We shall consider a system L of elements
$a, b, c, \cdots, x, y, u, v, \cdots, A, B, \cdots$ which is partially ordered, that
is, we shall assume that a relation $a\le b$ (written equivalently $b\ge a$) holds for
certain pairs of elements of L in such a way that

(i) $a\le b$, $b\le c$ together imply $a\le c$, and

(ii) $a\le b$, $b\le a$ are together equivalent to $a=b$.

The following axioms are postulated: 


AXIOM I. COUNTABLE COMPLETENESS. For every finite or countably infinite
set‡ of elements $a_1, a_2, \cdots$ there exist the following elements:

I_1. a sum element a (written $\sum_n a_n$, or equivalently $a_1+a_2+\cdots$) such that
for any x of L, $x\ge a$ if and only if $x\ge a_n$ for every n,

I_2. an intersection element a (written $\prod_n a_n$, or equivalently $a_1a_2\cdots$) such
that for any x of L, $x\le a$ if and only if $x\le a_n$ for every n.


* For the general case see C.G., part III, p. 22, Theorem 2.3; the special (irreducible) case is also
a consequence of the theorems of C.G., part I (see C.G., part I, p. 49, corollary to Theorem 5.16).

† That the "countable" axioms are really weaker than the original axioms of von Neumann can
be shown by a simple example which satisfies the "countable" axioms but which has no zero, and
hence is not complete.

‡ All sets considered in this paper will be non-void. Thus Axiom I does not imply the existence of
a zero or of a unit element.
AXIOM II. COUNTABLE CONTINUITY. Let $a_1, a_2, \cdots$ be any countably infinite
sequence, and let c be an arbitrary element of L. Then


AXIOM III. MODULARITY. For all a, b, c,

$$(a+b)c = \{a+(a+c)b\}c,$$

or what is equivalent, $a\le c$ implies $(a+b)c = a+bc$.


AXIOM IV. COMPLEMENTATION. For any three a, b, c such that $a\le b\le c$ there
exists an element d such that $b+d=c$, $bd=a$.
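A small finite model may help fix ideas. The sketch below assumes, purely for illustration, the lattice of subspaces of the three-dimensional vector space over the field with two elements, with + read as span and the product as intersection. This model is not discussed in the paper; it has a zero and a unit, and Axioms I and II hold in it trivially because it is finite. The code checks Axiom III in its identity form and Axiom IV by exhaustive search; all names are hypothetical.

```python
from itertools import combinations

def span(generators):
    """Smallest subspace of GF(2)^3 containing the generators (vectors are ints 0..7)."""
    space = {0}
    for g in generators:
        space |= {v ^ g for v in space}
    return frozenset(space)

def join(a, b):
    """Lattice sum a + b: the subspace spanned by a and b together."""
    return frozenset(u ^ v for u in a for v in b)

def meet(a, b):
    """Lattice product ab: the intersection of the two subspaces."""
    return a & b

# All subspaces of GF(2)^3, generated from sets of at most three vectors.
subspaces = {span(c) for r in range(4) for c in combinations(range(1, 8), r)}

# Axiom III in its identity form: (a+b)c = {a+(a+c)b}c for all a, b, c.
modular = all(
    meet(join(a, b), c) == meet(join(a, meet(join(a, c), b)), c)
    for a in subspaces for b in subspaces for c in subspaces
)

# Axiom IV: whenever a <= b <= c there is d with b+d = c and bd = a.
complemented = all(
    any(join(b, d) == c and meet(b, d) == a for d in subspaces)
    for a in subspaces for b in subspaces for c in subspaces
    if a <= b <= c
)

print(len(subspaces), modular, complemented)
```

Representing a subspace as a `frozenset` of bit-encoded vectors keeps the order relation (`<=` is set inclusion) and both lattice operations to one line each.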


2. Independent sets of elements. We make the following definition: 


DEFINITION 2.1. A finite ($\ge 2$) or countably infinite set of elements $a_1, a_2, \cdots$
is independent (written $(a_n,\ n=1, 2, \cdots)\perp$) if for every two mutually exclusive


The $a_n$ are said to be independent over $\theta$ if all such $(\sum_n a_{i_n})(\sum_n a_{j_n})$ equal $\theta$.*


LEMMA 2.1. If the $a_n$ are independent over $\theta$, then $\theta=\prod_n a_n$ and
$(a_n,\ n=1, 2, \cdots)\perp$.


Proof. Since $\prod_n a_n = a_1(\prod_{n\ge 2} a_n)$, the lemma follows from Definition 2.1.


LEMMA 2.2. If $a_1, a_2, \cdots$ are independent over $\theta$, then every subset
is independent over $\theta$.


Proof. The lemma follows directly from Definition 2.1 and Lemma 2.1.


LEMMA 2.3. If $a_1, a_2, \cdots$ are independent over $\theta$ and if $(a_{ir},\ r=1, \cdots)$ are
mutually exclusive subsets for $i=1, 2, \cdots$, then $\sum_r a_{ir}$, $i=1, 2, \cdots$, are independent
over $\theta$.


Proof. The lemma follows immediately from Definition 2.1.


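LEMMA 2.4. If $\theta, a_1, a_2, \cdots$ are such that for every two finite and mutually
exclusive subsets $a_{i_1}, \cdots, a_{i_p}$ and $a_{j_1}, \cdots, a_{j_q}$,

$$(a_{i_1}+\cdots+a_{i_p})(a_{j_1}+\cdots+a_{j_q}) = \theta,$$

then the $a_n$ are independent over $\theta$.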


* If $\theta$ is a zero element of L, that is, if $a\ge\theta$ holds for every a in L, then our independence over $\theta$
is precisely the notion of independence as used in C.G., part I, chap. 2.


Proof. Let $a_{i_n}$, $(n=1, \cdots)$, and $a_{j_n}$, $(n=1, \cdots)$, be any two mutually
exclusive subsets of $a_1, a_2, \cdots$. Then


and the lemma follows from Definition 2.1. 


COROLLARY. A countably infinite set of elements is independent over $\theta$ if
and only if every finite ($\ge 2$) subset is independent over $\theta$.

Proof. The corollary follows immediately from Lemmas 2.2 and 2.4. 

LemMaA 2.5. Let 0, a1, d2,-- satisfy a,20 for every n. Let tm 
be distinct integers, and let S be any set of integers not containing rm. If 
=6, then 


Proof. 


(2 
= (Xo. + Xa) = (= (Xan); 


which proves the lemma. 


Lema 2.6. If 0, ai, - are such that for every 
n=1,2,---, then the a, are independent over 0. 


Proof. By Lemma 2.4 we need only show that 


) 
xo} = 6; 
for all finite p, q and different $i_1, \cdots, i_p$; $j_1, \cdots, j_q$; and this follows from
a finite number of applications of Lemma 2.5.

COROLLARY. If $\theta, c, a_1, a_2, \cdots$ are such that

$$a_n(a_{n+1} + \cdots + a_{n+p} + c) = \theta$$

for all $n\ge 1$, $p\ge 1$, then the $c$, $a_n$ are independent over $\theta$.


Proof. By Lemma 2.6, $c, a_{n+p}, \cdots, a_n$ are independent over $\theta$ for
all $n\ge 1$, $p\ge 0$. Therefore, by the corollary to Lemma 2.4, $c, a_1, a_2, \cdots$ are independent
over $\theta$.

LEMMA 2.7. Let $a_1, a_2, \cdots$ be independent over $\theta$. If $S_1, S_2, \cdots$ are arbitrary
subsets of the integers $1, 2, \cdots$ and S is the set of the integers common to
all $S_t$, then

$$\prod_t \Big(\sum_{n\in S_t} a_n\Big) = \sum_{n\in S} a_n.$$


Proof. Let T =(a,,, d,,, - - - ) be the set of the integers not in S. Then 
t neSe neS neT t neSt 


Hat 


neS m neS 


(by repeated use of Lemma 2.5) as required. 


LemMaA 2.8. If an, (n=1,2,---), are independent over 0 and 
for n=1,---,p,andif 0Sv, Sa, forn=1,---,q,then 


(Ex)( Eu) 


n=1 n=1 n=1 


Proof. 
(uy + U2)(01 + V2) = (ty + + a2) + 02) 
= + {v2 + + 
(uy + {v2 + + arae)} = (ur + + 0101) 


= (m+ te) + U2V2) = + 


Thus the lemma holds for p=q=2. But if the lemma holds for all p=q<m, 
then it holds for p=q=m too. For


m—1 m—1 


(Xm) (Xm) = (Lm) + = 


n=l n=l n=1 n=1 n=1 


m—1 


since $\theta\le u_m$, $v_m\le a_m$; and $\sum_{n=1}^{m-1} a_n$, $a_m$ are independent
over $\theta$. Thus the lemma holds for all p=q. If $p\ne q$, say $p<q$, we
can set $u_n=\theta$ for $p<n\le q$ and apply the result just proved for p=q.


LemMaA 2.9. If a1, are independent over 0 and 0S tn, Vn Sn; for 
nm=1,2,---,then 


(untn) 


n 


(by Lemma 2.8) as required. 

LemMA 2.10. Let a;, a2,--- be independent over 0. If a;2ai;286 for all 
i, j7, and if the elements a;;, 7 =1, - - - , are independent over 0 (whenever there 
are at least two elements in the set) for every i=1, 2, - - - , then the set of all 
a;;, (i, 7=1, is independent over 0. 

Proof. If a;,;,,(r=1, - - -),and ax,:,, (s=1, - - - ), are mutually exclusive 
subsets of the a;;, then 


(by Lemma 2.9), which proves the lemma. 


(Xm) (Xs) = (wats). 
Proof. 
= {oe} =@ 
LEMMA 2.11. If and if $a_n$, $a_n'$ are defined for $n=1, 2, \cdots$ in such a
way that 


= Gn + , = 0, 
for n=1, 2,---, then ([],a,), a,’, (n=1, 2, - - -), are independent over 0 and 
+] 
Proof. 


a, + On+2 An+p + II 


= af ales + + + = 0 


for all n, p. Hence ([],a,), ay’, , - - are independent over by the corol- 
lary to Lemma 2.6. Furthermore 


Tr 
+a, = +a,, 


n=1 n=1 


forr=1,2,---. Hence 


= ( « + = Ya 
r n=1 n=1 r=1 

and the lemma is proved. 

3. Perspectivities and perspective mappings. We make the following defi- 
nition: 

DEFINITION 3.1. a, b are perspective (written $a\sim b$) if there exists an element
c such that

(i) $a+c=b+c$,

(ii) $ac=bc$.
Then c is called the axis of the perspectivity.


LEMMA 3.1. If a, b are perspective and $\theta\le ab$, then there exists an element d
such that
(i) $a+d=b+d=a+b$, and
(ii) $ad=bd=\theta$.


Proof. Let c be an axis of perspectivity for a and b. Since $\theta\le ab
\le \{c(a+b)+ab\}$, Axiom IV secures the existence of an element d such that

$$d+ab = c(a+b)+ab, \qquad dab = \theta.$$

For this d we have

$$a+d = a+c(a+b) = a+b.$$

Similarly $b+d=a+b$, and (i) holds.

$$ad = ad\{c(a+b)+ab\} = d\{ab+ac(a+b)\} = dab = \theta$$

since $ac(a+b) = ac\le ab$. Similarly $bd=\theta$, and (ii) holds. Thus d satisfies the
requirements of the lemma.
DEFINITION 3.2. The sublattice of the x satisfying $x\le a$ is denoted by L(a).
If $a\le b$, the sublattice of the x satisfying $a\le x\le b$ is denoted by L(a, b).*

LEMMA 3.2. If a, b are perspective with axis c and $\theta=ac=bc$, then a (1, 1)
correspondence between the elements of $L(\theta, a)$ and those of $L(\theta, b)$ which preserves
the relation $\le$ is defined by the inverse mappings

(P)    $b_1 = (a_1+c)b$,
(Q)    $a_1 = (b_1+c)a$.

Proof. If $\theta\le x\le a$ then under (P) $x\to(x+c)b$, and under (Q)

$$(x+c)b \to \{(x+c)b+c\}a = (x+c)(b+c)a = \{c+x(b+c)\}a = ca+x(b+c) = x(a+c) = x.$$

Hence (Q) is inverse to (P). Similarly (P) is inverse to (Q). It follows that the
correspondence is (1, 1). The invariance of the relation $\le$ is clear from the
definition of (P) and (Q).
DEFINITION 3.3. The mappings of Lemma 3.2 are called perspective map- 
pings. 
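The mappings (P) and (Q) can be traced in a concrete case. The sketch below is an illustration under an assumed model that is not part of the paper: subspaces of the three-dimensional space over the two-element field, with a and b two planes and the axis c a line lying in neither. The names `P`, `Q` and the chosen generators are hypothetical.

```python
def span(generators):
    """Smallest subspace of GF(2)^3 containing the generators (vectors are ints 0..7)."""
    space = {0}
    for g in generators:
        space |= {v ^ g for v in space}
    return frozenset(space)

def join(a, b):
    """Lattice sum: subspace spanned by a and b."""
    return frozenset(u ^ v for u in a for v in b)

def meet(a, b):
    """Lattice product: intersection."""
    return a & b

a = span([1, 2])   # plane {0, 1, 2, 3}
b = span([1, 4])   # plane {0, 1, 4, 5}
c = span([6])      # assumed axis: a line contained in neither plane
theta = meet(a, c)

def P(a1):
    """Mapping (P): b_1 = (a_1 + c) b."""
    return meet(join(a1, c), b)

def Q(b1):
    """Mapping (Q): a_1 = (b_1 + c) a."""
    return meet(join(b1, c), a)

# The interval L(theta, a): the zero subspace, the three lines of a, and a itself.
interval_a = [span(g) for g in ([], [1], [2], [3], [1, 2])]

is_axis = join(a, c) == join(b, c) and meet(a, c) == meet(b, c)
round_trip = all(Q(P(x)) == x for x in interval_a)
order_preserved = all((x <= y) == (P(x) <= P(y)) for x in interval_a for y in interval_a)
print(is_axis, round_trip, order_preserved)
```

In this model the line spanned by 2 is carried to the line spanned by 4 and back again, while the line spanned by 1, common to both planes, is left fixed.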
LEMMA 3.3. If $a_1$ corresponds to $b_1$ under a perspective mapping, then $a_1\sim b_1$.
Proof. Suppose $a_1$ corresponds to $b_1$ under a perspective mapping of $L(\theta, a)$
on $L(\theta, b)$ with axis c. Then $a_1$ is perspective to $b_1$ with axis c, for
= (bi + +c) 
= + c)(b +c) 
aye = + = ac = bc = + = Dic, 


and conditions (i) and (ii) therefore hold. 


* The Axioms I, II, III, IV hold in Z(a) and in L(a, 6). L(a) has a unit (greatest) element, 
namely a, and L(a, b) has a unit element b and a zero (smallest) element a. 
LEMMA 3.4. If $P_i$ is a perspective mapping of $L(\theta, a_i)$ on $L(\theta, b_i)$ for
$i=1, \cdots, p$, where $b_i=a_{i+1}$ for $i=1, \cdots, p-1$, then the product mapping of
the $P_i$ is a (1, 1) mapping of $L(\theta, a_1)$ on $L(\theta, b_p)$ which preserves the relation $\le$.


Proof. The lemma follows immediately from Definition 3.3. 
DEFINITION 3.4. The mapping of Lemma 3.4 is called a projective mapping. 


4. Transitivity of perspectivity in special cases. We prove the following 
lemma: 


LEMMA 4.1. $a\sim b$, $b\sim c$, $(a, b, c)\perp$ together imply $a\sim c$.

Proof. By Lemma 3.1 x, y exist such that

$$a+x=b+x=a+b, \qquad b+y=c+y=b+c,$$

$$ax=bx=\theta, \qquad by=cy=\theta,$$

where $\theta=abc$.
Then a is perspective to c with axis $d=(a+c)(x+y)$. For

$$a+d = a+(a+c)(x+y) = (a+c)(a+x+y) = (a+c)(a+b+y) = (a+c)(a+b+c) = a+c.$$

Similarly $c+d=a+c$. Thus $a+d=c+d$, and (i) holds. Also

$$ad = a(a+c)(x+y) = a(x+y) = a(a+b)(x+y) = a\{x+\theta\} = \theta.$$

Similarly $cd=\theta$. Thus $ad=cd$, and (ii) holds.

LEMMA 4.2. $a_n\sim b_n$, for $n=1, 2, \cdots$, and $(a_n+b_n,\ n=1, 2, \cdots)\perp$ together
imply $\sum_n a_n\sim\sum_n b_n$.


Proof. By Lemma 3.1 we may assume the existence of elements $x_n$ such
that

$$a_n+x_n = b_n+x_n = a_n+b_n, \qquad a_nx_n = b_nx_n = \theta,$$


where ,(anbdn). Then > are perspective with axis for the 
relations 


an + tn = (Gn + Xn) 
= (bn t+ an) = 


give property (i); and 


n n n 




( a.) ( = (GnXn) 
= (£4)( E+) 


n n 


(by Lemma 2.9) gives property (ii). 
Lemma 4.3. If an infinite independent sequence of elements ado, - - Sat- 
isfy forn=0,1,--- , then [ndn. 


Proof. From Lemmas 2.2 and 4.1, @o~a, for all x. By Lemma 3.1 we may 
therefore assume the existence of elements x,, (w=1, 2, - - - ), such that 


do + Xn = Gn + Xn = do + ay, 
= AnXn = 8,7 
where ],a,. We deduce successively 


Qo S an + Xn, n 


Xn 


n=1 
by Lemma 2.7. Hence 
do = a>, = = («x = 6, 
n=1 p n=l Pp n=1 
if only ao>.?_ =0 for p=1, 2, - - - . Now for any fixed 


p—1l 
a>, tn = 004 in + ¢ + > 


n=1 n=1 n=1 


and, since 


By ¢ > = + ap) (« + (a = Xpdo = 9,7 


n=1 n=1 


therefore 


a>, Xn = a>, Xn- 
n=1 


A finite number of such reductions gives 


= 1, , 
pmwi,Z,---, 
n=p n=1 
Xn = = 8,
n=1 


as required, and the lemma is proved. 
LEMMA 4.4. $a\sim x$, $a_1\sim x$, $ax\le a_1\le a$ together imply $a_1=a$.


Proof. Let $ax=a_1x=\theta$. Since $\theta\le a_1\le a$, Axiom IV secures the existence of
an element $a_1'$ such that

$$a_1+a_1' = a, \qquad a_1a_1' = \theta.$$

By Lemmas 3.1 and 3.2 there exist perspective mappings $T_1$ of $L(\theta, a)$ on
$L(\theta, x)$ and $T_2$ of $L(\theta, x)$ on $L(\theta, a_1)$. Define by induction on n


Ti(a,), = Ti(az ), = T2(%n), On41 = T2(xn ). 


Then Lemma 2.11, the relation ax=6, and Lemma 2.10, give (a,’, x,', 
n=1, 2,---) 1. Lemma 2.2 then gives (a, , x! , a44:) L, and Lemmas 3.3 
and 4.1 give a,’ ~ai4:, for m=1, 2,---. Lemma 2.2 shows that (a,’, 
n=1, 2,---)1; hence by Lemma 4.3 =[],a,/ Thus a,=a,+6 
=4,+a,; =a; and the lemma is proved. 


DEFINITION 4.1. If 6 has been defined, we sometimes write 


(® Xn) 


(or the equivalent x1@x2® - - in place of (or x1+%2+ ---), provided 
the x, are independent over 0. If 0SuSv, then |[v—u] will denote an element 
(fixed) such that u® |[v—u] =v. (Such an element exists by Axiom IV.) 


LEMMA 4.5. $a\sim x$, $x\sim b$, $ab\le x$ together imply $a\sim b$.


Proof. (a) Consider the special case where ab = bx=ax and xSa+b. Let 
b; = b(a+-x). Then }; is perspective to x with axis a, for 


hence relation (i) holds; and 


bia = D(a + x)a = ba = xa, 


hence (ii) holds. 

Since and =b(a+x)x = bx, bx Sb, <b; and Lemma 4.4 gives b; =b. 
Hence b<a+x. Similarly a<b+<. It follows that a is perspective to 6 with 
axis x, for 
hence relation (i) holds, and $ax=bx$ (by the special hypotheses of ($\alpha$)) hence
relation (ii) holds. This proves Lemma 4.5 in the special case ($\alpha$).

($\beta$) Suppose now only that $ab=bx=ax$ (equal, say $\theta$). Let $T_1$, $T_2$ be perspective
mappings of $L(\theta, x)$ on $L(\theta, a)$ and on $L(\theta, b)$, respectively. Set $a_0=a$,


Xo =x, bo =b, and define an, ax’, Xn, Xn’, bn, , for m=1, 2, - - - , by induction 
on 7 as follows: 

= Xn—1(Gn—1 + bn-1), an = T1(%n), b, = T2(%n), 

Xn], an Til(xn ), by = T2(xn ). 


If we set d=|],a,, ],x», 5=[],0,, then Lemma 2.11, the relation ab=8, 
and Lemma 2.3 give and a+6, a,’ +5,’, 


(n=1,2,---), are independent over 0. Lemma 3.3 shows that a,’ ~x,’ and 
x,/ ~b,! . Since a,’ b,’ =@ and x,’ (a,’ +0,’ ) =0, Lemmas 2.6 and 4.1 show that 
a,’ ~b,! , for n=1, 2, - - - . Now Lemma 4.2 shows that a~d if only a~8. 


a, £, 6 satisfy the hypotheses of Lemma 4.5 and the special conditions 
of (a), for b5=T2(#) imply =daxz =0, bz =bbxz =8, 
ab = dabb and, since for all n, 


és +6) = IM (I+ 


by two applications of Axiom IT,. Hence @~4; and Lemma 4.5 is proved for 
the special case (8). 
(vy) Suppose now only ab=bx (equal, say 0). Let 71, Tz be perspective 
mappings of L(@, a) on L(6, x) and of L(6, x) on L(8, b) , respectively. Set 
a, = ax, = Ti(a1) = ai, bi = T2(x1), 
ag => [a = ax|, x= T (a2), be = T2(x2). 


Then a=4a,@a2, x=%1@x2, b=): be. By Lemma 3.3, Since the 
hypotheses of Lemma 4.5 and the special conditions of (8) are clearly satis- 
fied by de, %2, be, dez~be. Lemma 4.2 now gives a~b, and Lemma 4.5 is estab- 
lished for the special case (7). 

(6) Suppose finally only the hypotheses of Lemma 4.5. The method by 
which (y) was deduced from (8) can be applied in the same way to deduce (6) 
from (vy). Thus Lemma 4.5 is proved. 


COROLLARY. If $T_1$, $T_2$ are perspective mappings of $L(\theta, a)$ on $L(\theta, x)$, and of
$L(\theta, x)$ on $L(\theta, b)$, respectively, and if $T_2T_1$ maps $a_1$ on $b_1$, then $a_1b_1=\theta$ implies
$a_1\sim b_1$.

Proof. Set $x_1=T_1(a_1)$. Lemmas 3.3 and 4.5 applied to $a_1$, $x_1$, $b_1$ give the
desired result.
5. Transitivity of perspectivity. We prove the following lemma: 


LEMMA 5.1. If $a_1\ge a_2\ge\cdots$ and c are given, and if $\theta=\prod_n(a_nc)$, then there
exist decompositions

$$a_n = a_nc\oplus a_n'$$

for $n=1, 2, \cdots$, such that $a_1'\ge a_2'\ge\cdots$.
Proof. Let J, = [@n—(@n¢+@n41)], for m=1, 2,---, and let 
I= [a—dc]. Then 
a, = 1, (asc + a2), 
dz = Ip + as), 


and clearly cl =clac=0, cI,=cI,(a,c+a,41) for all r21, 
k20. 

We can now prove that a,=a,c@I@>_~_,In, for r=1, 2, - - - . In the first 
place, a,c, J, I,, (n=r,r+1, -- +) are independent over @ by the corollary 
to Lemma 2.6 since a,c] =@; and for all p>n2r, 


} a (a. +I+ > In) = + On41) +I+ > In) 


m=n+1 m=n+1 


= I,(an¢ + + I+ > In) 


m=n+1 
= 6. 
Secondly, 


II + m+ Tn) 


m—1 
II + dm + > I). 


m=r 


Now if m2r, 


a, = (anc + Gn+1) 
é=10 a; 
n=r 
m—2
+ dm + + + Om + In 


n=r n=r 


m—2 
= 4,6 + Omi + 


n=r 


= + a, = 


Thus a,c@I@>_~_,J,=]],_,(a,) =a, as required. It is now clear that if we set 


af =l@>I,, 


we will obtain the decompositions required by the lemma. 

Corotiary. In Lemma 5.1, I, In, (n=1, 2, - - - ), are independent over 0. 

Proof. This is immediate from the proof of Lemma 5.1. 

THEOREM 5.1. TRANSITIVITY OF PERSPECTIVITY. a~x, x~b together im- 
ply a~b. 
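In a finite model the theorem can be confirmed by exhaustive search, which may serve as a sanity check on the definitions. The sketch below again assumes, for illustration only, the lattice of subspaces of the three-dimensional space over the two-element field; it says nothing about the infinite systems that are the real subject of the theorem, and all names are hypothetical.

```python
from itertools import combinations

def span(generators):
    """Smallest subspace of GF(2)^3 containing the generators (vectors are ints 0..7)."""
    space = {0}
    for g in generators:
        space |= {v ^ g for v in space}
    return frozenset(space)

def join(a, b):
    """Lattice sum: subspace spanned by a and b."""
    return frozenset(u ^ v for u in a for v in b)

def meet(a, b):
    """Lattice product: intersection."""
    return a & b

subspaces = {span(c) for r in range(4) for c in combinations(range(1, 8), r)}

# perspective[(a, b)] is True when some axis c gives a+c = b+c and ac = bc.
perspective = {
    (a, b): any(join(a, c) == join(b, c) and meet(a, c) == meet(b, c) for c in subspaces)
    for a in subspaces for b in subspaces
}

# Transitivity: a ~ x and x ~ b together imply a ~ b.
transitive = all(
    perspective[(a, b)]
    for a in subspaces for x in subspaces for b in subspaces
    if perspective[(a, x)] and perspective[(x, b)]
)

# In this model perspectivity coincides with having the same number of vectors,
# that is, the same dimension.
same_size = all(
    perspective[(a, b)] == (len(a) == len(b))
    for a in subspaces for b in subspaces
)
print(transitive, same_size)
```

Tabulating the relation once and then checking triples by lookup keeps the search small; the second check reflects the remark in the Introduction that perspective elements have equal dimension.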

Proof. (I) Let $\theta=abx$, and let $T_1$, $T_2$ be perspective mappings of $L(\theta, a)$
on $L(\theta, x)$ and of $L(\theta, x)$ on $L(\theta, b)$, respectively. Let $T=T_2T_1$ be the product
mapping of $L(\theta, a)$ on $L(\theta, b)$, and let $T^{-1}$ be the inverse mapping to T. We
shall use the notation $a_1\to b_1$, or the equivalent $b_1=T(a_1)$, to denote that $b_1$ is
the map of $a_1$ under T.

(II) Let c=ab, and let ao= [a—c], where ap is restricted to satisfy a cer- 
tain condition which will be stated precisely later (see (III) below). Set 
b, =(T (ao) )c, by = [T (ao) —b:], a1 = T-1(b;), ayy = T-1(by ). Then 

a=a@e, 
Sc, ai > bi, (bic = @). 
Since T(b:) is defined. Let {7(bi)}c=be, bf =[T(b:)—be], die 
= T-1(be), T-1(b,2), 4 = ), ad = Then 


a, = a2 ao, dz — > bg SC, dz — bio — bo, (bec = 0). 


Similarly, obtain the table 


a; — bi, (bic = 8), 
a; = a2 a2 — be SC, — bie — bs, = 8), 


r= 
An — Die...n — bo3...n >On, (b,c = 
We can now prove the following statements:

(a) and ). 

(B) ax’, biz...n, bx’ are independent over 0, forn=1,2,---. 

(y) If we set dn=ay +0,', then dj, de, are in- 
dependent over 0. 

(6) All the primed elements ay, by, are independent 
over 6. 

Proof of (a). Let [],a,=4. Then T7*(4) =TT - - - T(@) (n factors T) is de- 
fined for n=1, 2, - - - ; and 7(@), T7(@), - - - are independent over @ by the 
corollary to Lemma 2.6 since 


{T*(a)} { T»+1(4) 
has a map which is <a 9b=a ab=aoc =8, for all m, p21. Since 
~T**1(4@) by Lemma 3.3, Lemma 4.3 shows that @=@. The statement (a) 


now follows from Lemma 2.11. 
Proof of (8). 


De (742) on + Db (4-2) (748) + + bn) = 6 


since it has a T-* map which is <a 9b = 8. The statement (8) now follows from 
the corollary to Lemma 2.6. 
Proof of (vy). 9Sdn(dnsitdnset 


a | eee bn 
S (Gn + Dig. en +n) + + (n42) + + Base 


+ + dis. + 
= + + + Ongp) + + On) 


4 bi: Bin n 
+ Gnz2 + diz. . -(n42) + + 
+ Ongp + biz. ++(n+p) + + 
=8. 


The statement ($\gamma$) now follows from the corollary to Lemma 2.6.
Proof of ($\delta$). The statement ($\delta$) follows from ($\beta$) and ($\gamma$) by Lemma 2.10.
(III) By the corollary to Lemma 4.5 ay’ ~diz...n, 
. Since a,’ , biz...n, are independent, repeated application 


+ +(n+2) + + 
+ bia. + 
of Lemma 4.1 shows that $a_n' \sim b_n'$ for $n = 1, 2, \cdots$. Now set
n=1 
B= (@ ba) © (@ 
n=1 r#8 


By Lemma 4.2, $A \sim B$. Furthermore $T(A) = B$.

Now suppose that $a_0$ was chosen (in (II) above) in such a way that, the $b_n'$
having been defined as above, we should have $(\sum_{n=1}^{\infty} b_n')c = \theta$ (that such an $a_0$
exists will be shown in (V) below). Setting


s= E | 
we have Bg=Bgc=g (41)---2) and we can define 
h= [b— (B@g)). 


Then we clearly have 


c= (® g; 


a=A®@g, 
b=BOgOh. 


By Lemma 4.2, $a \sim b$ if only $h = \theta$.
(IV) We proceed to show that $h = \theta$. Let $g' = T^{-1}(g + h)$. Since


it follows that there exists a perspective mapping $S$ of $L(\theta, g')$ on $L(\theta, g)$.
Now set $h_0 = h$ and define $h_n'$, $h_n$, for $n = 1, 2, \cdots$, by induction on $n$ as
follows:


hn = n = ). 


Then $h_0, h_1, \cdots$ are independent over $\theta$ by the corollary to Lemma 2.6 since


+ + = (ST-)"{ +--+ + 
= (ST-1)"{ hg(In + + hy)} 
= (ST-!)*(6) = 6. 


Since , Ite! and , Lemma 4.5 shows that 
Now Lemma 4.3 gives h = ho) as required. 
(V) To complete the proof of Theorem 5.1 we need only show that the
$a_0 = [a - ab] = [a - c]$ defined in (II) above could be chosen in such a way that
$(\sum_{n=1}^{\infty} b_n')c = \theta$ will hold in (III) above. We first note that if we set


then it is sufficient to choose a) so that a=a)@c and 


= (Yn + ao)c, 


for n=1, 2, - - - . For if a is so chosen, then 
< + bi + be +--+ + 
= + a1 + + + 
< + ao + big + 
= + ao) + bia + + 
= T?{ b(n—1yn(nt1) (V2 + ao + bis } 
S S S = 8; 


hence 
+ bf + b2 +--- +32) =8, 


for $n = 1, 2, \cdots$. By Lemma 2.6, $c, b_1', b_2', \cdots$ are independent over $\theta$;
hence $(\sum_{n=1}^{\infty} b_n')c = \theta$, as required.
Thus we have only to construct an $a_0$ such that


BC, nC = (% + 


for n=1,2,---. Apply Lemma 5.1 tom 2=m= --- and c, and obtain J, 
n=1, 2,---, asin the proof of Lemma 5.1. Then m- 
Let «= ]. Then 


u(2 + = wo (1 + = 6, 
m=1 m=1 


and we can set 


m=1 


This ay satisfies our requirements, for 


ate=ct+ + 


m=1 


aoc = + c)c = (7 +>
m=1 


-(1 + Im) = 0. 


m=1 


Hence $a_0 \oplus c = a$. And


+ ao)c = + + + 


m=1 
= (v,¢ + ao)c 
= Une + = +O = 
This completes the proof of Theorem 5.1. 
6. Additivity and continuity properties of perspectivity. We prove the following
lemma:
LEMMA 6.1. If $\theta$ is defined and if
$a = a_1 \oplus a_1' = a_2 \oplus a_2'$,
then $a_1 \sim a_2$ implies $a_1' \sim a_2'$.
Proof. The perspectivity $a_1 \sim a_2$ implies, by Lemma 3.1, the existence of
an $x$ for which
$a_1 + x = a_2 + x = a_1 + a_2$,
$a_1x = a_2x = \theta$.


Let $c = [a - (a_1 + a_2)]$. Then $a_1'$ is perspective to $(x + c)$ with axis $a_1$, for we
have the relation

$a_1' + a_1 = a = c + (a_1 + a_2) = c + (a_1 + x) = (x + c) + a_1,$

hence relation (i) holds; and
aia; = 0 = xa, = (x + = {x + (a1 + 
= {x+ (a+ = (x+cn, 


hence relation (ii) holds. 
Similarly $a_2'$ is perspective to $(x + c)$. Theorem 5.1 then proves that
$a_1' \sim a_2'$ as required.


LEMMA 6.2. If $\theta$ is defined, and if
$a = a_1 \oplus a_1'$,
$b = b_1 \oplus b_1'$,
then $a \sim b$, $a_1 \sim b_1$ together imply $a_1' \sim b_1'$.

Proof. Let $T$ be a perspective mapping of $L(\theta, a)$ on $L(\theta, b)$, and let
$u = T(a_1)$, $v = T(a_1')$; then $a_1 \sim u$, $a_1' \sim v$, and $b = u \oplus v$. Since $a_1 \sim b_1$, Theorem
5.1 gives $u \sim b_1$, and Lemma 6.1 gives $v \sim b_1'$. Since $a_1' \sim v$ and $v \sim b_1'$, Theorem
5.1 gives $a_1' \sim b_1'$.


LEMMA 6.3. $a_1 \sim b_1$, $a_2 \sim b_2$, $a_1a_2 = b_1b_2 = \theta$ together imply $(a_1 \oplus a_2) \sim (b_1 \oplus b_2)$.
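Proof. Let $d = a_1 + a_2 + b_1 + b_2$, and define $a' = [d - (a_1 \oplus a_2)]$,
$b' = [d - (b_1 \oplus b_2)]$. Then

$(a' \oplus a_2) \oplus a_1 = d = (b' \oplus b_2) \oplus b_1.$

By Lemma 6.1, $(a' \oplus a_2) \sim (b' \oplus b_2)$; hence by Lemma 6.2 $a' \sim b'$. Since
$a' \oplus (a_1 \oplus a_2) = b' \oplus (b_1 \oplus b_2)$, Lemma 6.1 proves $(a_1 \oplus a_2) \sim (b_1 \oplus b_2)$ as required.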
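LEMMA 6.4. If $\theta$ is defined, and if $a_1, \cdots, a_p$ and $b_1, \cdots, b_p$ are two sets
of elements, each independent over $\theta$, with $a_r \sim b_r$ for $r = 1, \cdots, p$, where
$p = 1, 2, \cdots$, then

$\sum_{r=1}^{p} (\oplus a_r) \sim \sum_{r=1}^{p} (\oplus b_r).$

Proof. Suppose the lemma established for $p = n$ for some fixed $n = 1, 2, \cdots$.
Then

$(a_1 \oplus \cdots \oplus a_n) \sim (b_1 \oplus \cdots \oplus b_n), \qquad a_{n+1} \sim b_{n+1}$

imply, by Lemma 6.3, $(a_1 \oplus \cdots \oplus a_n \oplus a_{n+1}) \sim (b_1 \oplus \cdots \oplus b_n \oplus b_{n+1})$; and the
lemma will hold for $p = n + 1$. Since the lemma is trivially true for $p = 1$, it
holds, by induction, for all $p = 1, 2, \cdots$.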

We now define a relation a«b as follows: 


DEFINITION 6.1. $a \ll b$ if $a \sim b_1$ for some $b_1 \le b$.


LEMMA 6.5. (I) $a \le b$ implies $a \ll b$.
(II) $a \le b$, $b \ll c$ together imply $a \ll c$.

(III) $a \ll b$, $b \ll c$ together imply $a \ll c$.
(IV) $a \ll b$, $b \ll a$ together imply $a \sim b$.


Proof. (I) follows from $a \sim a$.

(II): Let $\theta = ac$, and let $T$ be a perspective mapping of $L(\theta, b)$ on
$L(\theta, c)$. Then $T(a)$ is defined, $a \sim T(a)$, and $T(a) \le c$. Hence $a \ll c$.

(III): $a \ll b$ means $a \sim b_1$ for some $b_1 \le b$. Since $b \ll c$, (II) gives $b_1 \sim c_1$
for some $c_1 \le c$. Theorem 5.1 then gives $a \sim c_1$. Hence $a \ll c$ as required.

(IV): $a \ll b$ means $a \sim b_1$ for some $b_1 \le b$. If $b \ll a$, then $b \sim a_1$ for some
$a_1 \le a$, and $b_1 \sim a_2$ for some $a_2 \le a_1$, by (II). Then, by Theorem 5.1, $a \sim a_2$; and,
since $a_2 \le a$, Lemma 4.4 implies $a_2 = a$. Since $a_2 \le a_1 \le a$, we have $a = a_1 \sim b$; that
is $a \sim b$ as required.

LEMMA 6.6. If $c \ll a$ and $d = ca$, there exists an element $a_1$ with $d \le a_1 \le a$ and
$c \sim a_1$.

Proof. $c \ll a$ means $c \sim a_2$ for some $a_2 \le a$. Let $\theta = a_2d$; then $\theta \le ca_2$, and there
exists, by Lemmas 3.1 and 3.2, a perspective mapping $T$ of $L(\theta, c)$ on $L(\theta, a_2)$.
Define $c' = [c - d]$ and $\bar{c}' = T(c')$; then $\bar{c}' \le a_2 \le a$, $c'd = \theta$, $c' \sim \bar{c}'$, and
$\bar{c}'d \le a_2d = \theta$. By Lemma 6.3,

$c = (c' \oplus d) \sim (\bar{c}' \oplus d) \le a,$
and $a_1 = \bar{c}' \oplus d$ satisfies all the requirements of the lemma.

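LEMMA 6.7. If $\theta$ is defined and if $a \oplus a_1 \ge a' \oplus a_2$, then $a \sim a'$ implies $a_2 \ll a_1$.

Proof. Let $v = [(a \oplus a_1) - (a' \oplus a_2)]$. Then $a \oplus a_1 = a' \oplus a_2 \oplus v$, and Lemma
6.1 implies $a_1 \sim (a_2 \oplus v)$; thus $a_2 \ll a_1$ as required.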

LEMMA 6.8. If $\theta$ is defined and if $a \oplus a_2 \ll a \oplus a_1$, then $a_2 \ll a_1$.

Proof. By Lemma 6.6 (with the $\theta$ of the present lemma in the place of
the $d$ of Lemma 6.6) $(a \oplus a_2) \sim u$, where $\theta \le u \le (a + a_1)$. Let $T$ be a perspective
mapping of $L(\theta, a + a_2)$ on $L(\theta, u)$, and let $d = T(a)$ and $d_2 = T(a_2)$; then
$d \oplus d_2 = u \le a \oplus a_1$ and $a \sim d$. By Lemma 6.7, $d_2 \ll a_1$; and since $d_2 \sim a_2$, Lemma 6.5
(III) implies $a_2 \ll a_1$, which proves the lemma.

Lemna 6.9. If 6 is defined, and if 

where usa and (y.@ --- @v,)a=0, (p=1, 2, - - - ), then there exist elements 
withy,~v; , 01, +, 0p imdepend- 
ent over 0, andu@v ® --- Gry Sa. 

Proof. Define a, = [a—w]; then 


and Lemma 6.8 implies (2:® - - - @vp) « a;. Thus there exists, by Lemma 6.6, 
a perspective mapping T of L(@, - - - @v,)) on L(@, a2) for some az 
satisfying Let =T(v,) for r=1,---, p. Then 2,~2,', for 
r=1,--+, p; >0?.,(@2/) Sa; and 2; , (r=1, - - - , p), are independ- 
ent over 6. Since 


r=l r=1 r=1 r=1 


st) = 6, 
$v_1, \cdots, v_p, v_1', \cdots, v_p'$ are independent over $\theta$ by Lemma 2.10. Thus
$v_1', \cdots, v_p'$ satisfy all the requirements of the lemma.
Lemma 6.10. Let a;52>a2= --- be an infinite set of elements, and let c sat- 


isfy where Suppose further that cxa, for n=1, 2,---. 
Then there exists an element c’ <a, such that c~c', cc' =0, and (c@c’) «<a, for 
n=1,2,---. 


Proof. Define c,= [ca;—cai4:] for t=1, 2,--- ; then and 


¢= 6, = > (® c:) ® II (ca,) 


+0= 


Suppose that ¢,,,,..-r,¢ have been defined, for all --- 
with 1<r, <p for some fixed p=1, 2,--- (m taking all values possible), in 
such a way that the following conditions are satisfied: 

(a) p Ce 2, + LS <r, <t, for r,<p, are in- 
dependent over @, 

(8), If we set 


t=rn+1 
then 
and 
t=p t=p 

where in the last summation 1, re, - - - , r, take ‘on all possible values with 


Then ¢ «@p41, and 


t=p+1 t=p+l1,rn<p 


® 


Tr<p 


t=1 t=1 
t=1 t=1 
and Lemma 6.9 secures the existence of elements (1 S11
< +++ <ra<p), such that S Cp~Cp rap 
<P), Cp (LSI - over 6. Hence 
if we define [Cp ANA = [Crary + + |, 
then (a) (8)p4:, and (y)p4: will be satisfied. Since (8):, and (7): 
are trivially satisfied, it follows that we can define the ¢,,,,...2,4, (for all 
1<1<n<--- <r,<t< ©), so that (a),, (8),, and (y), are satisfied for all 
p=1, 2,---. Then aq, are independent over 0, and 
for n=1, 2, - - -. Lemmas 2.3 and 4.2 now imply that ,(@cn) 
If we set we will have c~c’, cc’=6, and 
(c@c’) 

Finally, since cxa,, we have c~c? for some c? with 9<c?<a, by 
Lemma 6.6. Applying the reasoning of the preceding paragraph to c? and 
We obtain such that c?~c?’ and (c?@c?’)<a,. Then 
c’~c, c~c”, c?~c’, imply, by Theorem 5.1, c’~c?’ and hence, by Lemma 
6.3, (c@c’)~(c? @c*’). Lemma 6.5 (III) now implies that (¢@c’) «a, for 
all p=1, 2, ---,andc’ satisfies all the requirements of the lemma. 


LEMMA 6.11. The hypotheses of Lemma 6.10 imply $c = \theta$.


Proof. Suppose that &, é,---, &» have been defined for some fixed 
p=0,1,--- in sucha way that, if we write for 4+ - - - then 


(A)» c~é, for r=1,--- , 27; 

(u)» &,r=1,---, 2”, are independent over 6; 

(v)» Sa and Cy «a, form=1,2,---. 
Then there exists, by Lemma 6.10, an element c’ <a; such that cip)~c’, 
Cpe’ = 8, and (ci) «a, for n=1, 2, - - - . Let T bea perspective mapping 
of on L(@, c’), and let @,,=T7(é,) for r=1,--- , 27. Then (A) p41, 
(u) p11, and (v)p41 will be satisfied. Since we can define ¢,=c to satisfy (A)o, 
(u)o, and (v)o, it follows that we can define, by induction on #, an infinite 
sequence ¢,,(m=1,2, - - - ),satisfying (A)>,(u)», and(v),,for all p=0,1,---. 
Then we have ¢, é, - - - independent over 6 and é,~én4: (by Theorem 5.1 
since é,~C, C~én41); hence by Lemma 4.3. Since c=& we have c=90. 
This proves the lemma. 


LEMMA 6.12. Without the condition $c \le a_1$, the remaining hypotheses of
Lemma 6.10 imply $c = \theta$.


Proof. $\theta \le ca_1$, $c \ll a_1$, imply, by Lemma 6.6, the existence of a $c_1$ with
$\theta \le c_1 \le a_1$ and $c \sim c_1$. By Lemma 6.5 (III), $c_1$ and $a_1, a_2, \cdots$ satisfy the
hypotheses of Lemma 6.10; hence Lemma 6.11 implies $c_1 = \theta$. Since $c \sim c_1 = \theta$
and $\theta \le c$, Lemma 4.4 gives $c = \theta$. This proves the lemma.


LEMMA 6.13. If $\theta$ is defined and $c \ll (a \oplus b)$, there exists a decomposition
$c = c_1 \oplus c_2$ with $c_1 \ll a$, $c_2 \ll b$.

Proof. By Lemma 6.6, c~u, 0<u<(a@b). Let u;=au, define = [u—m], 
and let (a+ )b; then , and with axis a, for 


uw ta=(ui 
hence (i) is satisfied; and 
uja = uj ua = = 0 = ugha = 


hence (ii) is satisfied. 
Now let and then @c@, Sa, and 
C2~U2 Sb (by Theorem 5.1, since and ~u2). This proves the lemma. 


LEMMA 6.14. If $a_1 \ge a_2 \ge \cdots$ and $c \ll a_n$ for $n = 1, 2, \cdots$, then $c \ll \prod_n a_n$.


Proof. Let d=] [,@n, 9=cd. Apply Lemma 5.1 to and to 
obtain a, @d with = ---. Then 

Let co=c and d)=d. Suppose c¢,, c;’ , d,, d, , have been defined for 1<r<p 
and for some p=1, 2,--- in such a way that the following conditions are 
satisfied: 

(@)p , dp1=4, 04, ,c/ , fori sr<p. 

(B)p Cpa % dps, (p>1), and cps «(an +41), form=1,2,---. 

Then, since « +d,), we can define c, , , by Lemma 6.13, so 
that 

Cp-1 = Cp Cp ap, S Gyr. 
Now define é, = [d,_1.—4,’ ]. Then, by the use of Lemma 6.8, 
é,1=4,04,, Cp (an @a,), for m=1,2,---. 


Thus (a) p41, (8) p41 are satisfied. Since (a); and (8); are satisfied by Co, do, it 
follows that we can define by induction ¢,, c,;, @,, d , for r=1, 2,---, to 
satisfy (a), and (8), for all p=1,2,---. 

By Lemma 2.11 


Since 


«a! for r=1,2,---, 
n=1 


n=l 
Lemma 6.12 implies], =9. Hence - ,(@c,’ ). Since
e( > (® - ca( (® a2) = 6, 
n=1 n=1 


Lemma 2.10 implies , a,’, (n=1, 2,- +--+), are independent. Lemmas 2.3 
and 4.2 now give 


c= (@ a) Sa = [1 
n=1 n=1 
Hence c «]],a@,, which proves the lemma. 
LEMMA 6.15. If $\theta$ is defined, and if
$a \oplus a' = b \oplus b'$,
then $b \ll a$ implies $a' \ll b'$.


Proof. Suppose $b \sim a_1 \le a$, and define $a_1' = [a - a_1]$. Then $a_1 \oplus (a_1' \oplus a') = b \oplus b'$,
and Lemma 6.2 implies $(a_1' \oplus a') \sim b'$; hence by Lemma 6.5 (II) $a' \ll b'$.


LEMMA 6.16. If $a_1 \le a_2 \le \cdots$ and if $a_n \ll c$ for $n = 1, 2, \cdots$, then $\sum_n a_n \ll c$.
Proof. Let a;c=0, and define for n=2, 3,---; 
then a,=)>._”_,u,, (n=1, 2,---), and 2, m2, - - - are independent over 6 by 


Lemma 2.6, since - = lan form =1, 2, - - - . Set 
bn form=0,1,--- ; then --- , and 


bo = (@ = an @ dn, =1,2,---. 


r=1 


Hence Since u, Sa, we also have thus 
bo =>. Now define c’ = [(bo +c) —c] and bi = [(bo+c) —bo]. Then 


= bo bd = an (bn 


forn=1,2,---,and Lemma 6.15 implies c’ « @b¢ ) form=1,2,---. 
plying Lemma 6.14 to c’ and (,+50) = (b2+o') = - - - , we obtain 


+ 08) = + =O0+b) 


n 


Lemma 6.15 now implies >~,,a@, =) «c. This proves the lemma. 
DEFINITION 6.2. If $a_1, a_2, \cdots$ is an infinite sequence, we define

$\limsup a_n = \prod_{p}\Bigl(\sum_{n=p}^{\infty} a_n\Bigr), \qquad \liminf a_n = \sum_{p}\Bigl(\prod_{n=p}^{\infty} a_n\Bigr).$

The sequence is called convergent if $\limsup a_n = \liminf a_n$, and for a convergent
sequence we define $\lim a_n = \limsup a_n = \liminf a_n$.
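
As a concrete illustration of Definition 6.2, one may take the lattice of subsets of a finite set, with sum read as union and product as intersection. This choice of lattice is an assumption made only for the example and is far more special than the lattices treated in the text. The Python sketch below (all names hypothetical) handles sequences that are eventually periodic, that is, a finite prefix followed by a repeating cycle; for such sequences every sufficiently late tail contains the whole cycle, so lim sup is the union of the cycle and lim inf is its intersection.

```python
from functools import reduce

def lim_sup(prefix, cycle):
    """lim sup of the sequence prefix, cycle, cycle, ... of frozensets.

    Each tail union eventually equals the union of the cycle, and tail
    unions can only shrink as the starting index grows, so their
    intersection is the union of the cycle. The prefix has no effect."""
    return reduce(frozenset.union, cycle)

def lim_inf(prefix, cycle):
    """lim inf: tail intersections can only grow with the starting index,
    and eventually equal the intersection of the cycle."""
    return reduce(frozenset.intersection, cycle)

def limit(prefix, cycle):
    """Return lim a_n when the sequence is convergent, otherwise None."""
    upper, lower = lim_sup(prefix, cycle), lim_inf(prefix, cycle)
    return upper if upper == lower else None

a = frozenset({1})
b = frozenset({1, 2})
increasing = limit([a], [b])      # the sequence a, b, b, b, ...
oscillating = limit([], [a, b])   # the sequence a, b, a, b, ...
```

The increasing sequence converges to its union, as in the monotone case, while the oscillating one has lim sup $b$ and lim inf $a$ and therefore no limit.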


LEMMA 6.17. If $a_1 \le a_2 \le \cdots$ ($a_1 \ge a_2 \ge \cdots$) then $\lim a_n$ is defined and is
equal to $\sum_n a_n$ ($\prod_n a_n$).
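Proof.

$\limsup a_n = \prod_p\bigl(\sum_{n=p}^{\infty} a_n\bigr) = \prod_p\bigl(\sum_{n=1}^{\infty} a_n\bigr) = \sum_n a_n \quad \bigl(= \prod_p (a_p) = \prod_n a_n\bigr);$

$\liminf a_n = \sum_p\bigl(\prod_{n=p}^{\infty} a_n\bigr) = \sum_p (a_p) = \sum_n a_n \quad \bigl(= \sum_p\bigl(\prod_{n=1}^{\infty} a_n\bigr) = \prod_n a_n\bigr).$

Hence $\limsup a_n = \liminf a_n = \sum_n a_n$ ($= \prod_n a_n$) and the lemma follows from
Definition 6.2.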


THEOREM 6.1. CONTINUITY OF PERSPECTIVITY. If $a_1, a_2, \cdots$ and $b_1, b_2, \cdots$
are convergent sequences with $\lim a_n = a$ and $\lim b_n = b$, then $a_n \sim b_n$ for
$n = 1, 2, \cdots$ implies $a \sim b$.


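Proof. For every fixed $p = 1, 2, \cdots$ we have

$\Bigl(\prod_{n=p}^{\infty} a_n\Bigr) \ll \Bigl(\sum_{n=r}^{\infty} b_n\Bigr), \qquad r = p, p+1, \cdots,$

and Lemma 6.14 implies $(\prod_{n=p}^{\infty} a_n) \ll \prod_{r=p}^{\infty}(\sum_{n=r}^{\infty} b_n) = b$. Lemma 6.16 gives
$\sum_p(\prod_{n=p}^{\infty} a_n) \ll b$, that is $a \ll b$. Similarly $b \ll a$. Then, by Lemma 6.5 (IV),
$a \sim b$, which proves the theorem.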


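COROLLARY. If $a_1 \le a_2 \le \cdots$ and $b_1 \le b_2 \le \cdots$ ($a_1 \ge a_2 \ge \cdots$ and $b_1 \ge b_2
\ge \cdots$) then $a_n \sim b_n$ for $n = 1, 2, \cdots$ implies $\sum_n a_n \sim \sum_n b_n$ ($\prod_n a_n \sim \prod_n b_n$).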


Proof. By Lemma 6.17 this is a special case of Theorem 6.1. 


THEOREM 6.2. ADDITIVITY OF PERSPECTIVITY. If $\theta$ is defined and if $a_n$,
$1 \le n < p$, and $b_n$, $1 \le n < p$, are each independent over $\theta$, where $p$ is finite or
infinite, then $a_n \sim b_n$ for $1 \le n < p$ implies

$\sum_{1 \le n < p} (\oplus a_n) \sim \sum_{1 \le n < p} (\oplus b_n).$

Proof. By Lemma 6.4 $\sum_{n=1}^{r}(\oplus a_n) \sim \sum_{n=1}^{r}(\oplus b_n)$ for all $r < p$. If $p$ is finite
this proves the theorem, and if $p$ is infinite we have, using the corollary to
Theorem 6.1,

$\sum_{n=1}^{\infty}(\oplus a_n) = \sum_r \sum_{n=1}^{r}(\oplus a_n) \sim \sum_r \sum_{n=1}^{r}(\oplus b_n) = \sum_{n=1}^{\infty}(\oplus b_n).$

This proves the theorem.*


* For the special case of an irreducible geometry (finite dimensional or continuous), all the 
lemmas and theorems of §6 are easy consequences of the existence of a dimension function, and 
conversely, some of them are useful in establishing the existence of the dimension function (see C.G. 
part 1, chaps. 6 and 7). The notion of a convergent sequence is given in an equivalent form by von 
Neumann, Proceedings of the National Academy of Sciences, vol. 22 (1936), p. 107 (see the defi- 
nition of lim** given there). 
PERMANENT CONFIGURATIONS IN THE PROBLEM
OF FIVE BODIES* 


BY 
W. L. WILLIAMS 


1. Introduction. In 1772 Lagrange found that if three bodies of arbitrary 
masses are placed at the vertices of an equilateral triangle and given the 
proper initial velocities, they will continue to be at the vertices of an equi- 
lateral triangle which may change in size, but not in shape. He also showed 
that if the three bodies are placed in a straight line and started off suitably, 
they will remain in a straight line, and the ratios of the mutual distances will 
be constant. More recently Moulton† has extended the straight line
configuration to $n$ bodies, and some particular instances of other configurations
have been given by Hoppe,‡ Andoyer,§ and Longley.‖

A permanent configuration is a configuration of $n$ bodies which has the
property that the ratio of distances between corresponding bodies is constant.
In other words, the figure may change in size, but not in shape. The object 
of this paper is to determine necessary and sufficient conditions for any plane 
configuration of five bodies, other than a straight line configuration, to be 
permanent and to give a detailed analysis of the various types of such con- 
figurations, both convex and concave, that can be formed by rearranging the 
bodies to give pentagons of different shapes. This work may be regarded as 
an extension of that of MacMillan and Bartky¶ in which the problem of four
bodies is treated. 

2. The equations of motion. Let $m_1, m_2, \cdots, m_5$ with coordinates $(x_1, y_1)$,
$(x_2, y_2), \cdots, (x_5, y_5)$, respectively, represent the masses of a plane system of
five bodies referred to a set of rectangular axes, with origin at their common 
center of mass, which rotates with uniform angular velocity w. If we assume 
that the bodies attract each other according to the Newtonian law and move 
in circles around the origin with uniform angular velocity w, the differential 


* Presented to the Society, September 6, 1938; received by the editors May 20, 1937.

† Periodic Orbits, Carnegie Institution, 1920, p. 285.

‡ Erweiterung der bekannten Speciallösung des Dreikörperproblems, Archiv der Mathematik und
Physik, vol. 64, p. 218.

§ Sur l'équilibre relatif de n corps, Bulletin Astronomique, vol. 23 (1906), p. 50.

‖ Some particular solutions in the problem of n bodies, Bulletin of the American Mathematical
Society, vol. 13 (1906-1907), p. 324.

¶ Permanent configurations in the problem of four bodies, these Transactions, vol. 34 (1932),
pp. 838-875.
equations of motion reduce to


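(1) $\omega^2 x_i = \sum_{j=1}^{5} \frac{m_j (x_i - x_j)}{r_{ij}^3}, \qquad \omega^2 y_i = \sum_{j=1}^{5} \frac{m_j (y_i - y_j)}{r_{ij}^3}, \qquad i = 1, 2, \cdots, 5;\ j \ne i,$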


where $r_{ij}$ is the distance between the masses $m_i$ and $m_j$.
It is to be observed that the equations

(2) $\sum_{i=1}^{5} m_i x_i = 0, \qquad \sum_{i=1}^{5} m_i y_i = 0,$

which express the fact that the origin of coordinates is at the center of mass,
follow directly from (1).
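
The condition that a plane configuration rotate rigidly about its center of mass can be checked numerically. The following Python sketch assumes Newtonian attraction with the gravitational constant set to one (an assumption of this example, as are all function names); it measures how far a configuration is from satisfying "acceleration equals $-\omega^2$ times the position relative to the center of mass" for every body, and applies the test to Lagrange's equilateral triangle with arbitrarily chosen masses.

```python
import math

def center_of_mass(masses, points):
    total = sum(masses)
    cx = sum(m * x for m, (x, y) in zip(masses, points)) / total
    cy = sum(m * y for m, (x, y) in zip(masses, points)) / total
    return cx, cy

def accelerations(masses, points):
    """Newtonian acceleration of each body (gravitational constant = 1)."""
    result = []
    for i, (xi, yi) in enumerate(points):
        ax = ay = 0.0
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            r3 = math.hypot(xj - xi, yj - yi) ** 3
            ax += masses[j] * (xj - xi) / r3
            ay += masses[j] * (yj - yi) / r3
        result.append((ax, ay))
    return result

def rigid_rotation_fit(masses, points):
    """Return (omega_squared, residual).

    omega_squared is the least-squares value in
    acceleration_i = -omega^2 * (p_i - c), with c the center of mass;
    residual is the largest mismatch left over among the bodies."""
    cx, cy = center_of_mass(masses, points)
    acc = accelerations(masses, points)
    num = sum(-(ax * (x - cx) + ay * (y - cy))
              for (ax, ay), (x, y) in zip(acc, points))
    den = sum((x - cx) ** 2 + (y - cy) ** 2 for (x, y) in points)
    w2 = num / den
    residual = max(math.hypot(ax + w2 * (x - cx), ay + w2 * (y - cy))
                   for (ax, ay), (x, y) in zip(acc, points))
    return w2, residual

masses = [1.0, 2.5, 7.0]                                     # arbitrary
triangle = [(0.0, 0.0), (2.0, 0.0), (1.0, math.sqrt(3.0))]   # side 2
omega_squared, residual = rigid_rotation_fit(masses, triangle)
scalene = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
_, scalene_residual = rigid_rotation_fit(masses, scalene)
```

For the equilateral triangle the fitted $\omega^2$ agrees with the total mass divided by the cube of the side, up to rounding, and the residual is negligible; the scalene triangle with the same masses leaves a large residual.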
Define a constant $r_0$ by

(3) $\omega^2 = \frac{1}{r_0^3} \sum_{i=1}^{5} m_i.$

The constant $r_0$ is to replace $\omega$ and will be used later in defining the regions
for positive masses.
The equations (1), as a consequence of (2) and (3), may now be written 


in the equivalent form 
5 
— x;)m; = 0, 
(4) i=1,2,---,5; #3, 
— yim; = 0, 


where 


If in (4) $m_i$ and $m_{i+1}$, ($i = 1, 2, \cdots, 5$), are eliminated, respectively, from
the $(i+1)$th and $i$th equations in $x$ and the corresponding equations in $y$, and
if we denote twice the area of the triangle whose vertices have as coordinates
any three of the points $(x_i, y_i)$, ($i = 1, 2, \cdots, 5$), by $D$ with a pair of numbers
(from 1 to 5) as subscripts which do not occur as subscripts in the coordinates
of the vertices of the triangle, we have


Si + Sigs + = 0, 


(5) 


+ + = 0, 


or 
(6) = — Digs
1,2,---,5, 


a system of equations, equivalent to (4), in which the variables x and y do 
not occur explicitly. It is to be understood here as well as hereafter that the 
subscripts 6, 7, 8, and 9 are to be replaced by 1, 2, 3, and 4, respectively. 
For each value of $i$ in equations (6), two independent relations may be
formed, each expressing a mass ratio as a function of the D’s and S’s. Among 
these ten mass ratios six distinct equalities exist; and when (6) is substituted 
in them we obtain, after some reduction, 
Sii¢aS ite Si, 41,843 


(7) = $= 1,2,--- 5, 


where 


(8) = 
irs 


It is to be observed that the subscript on $\lambda$ corresponds to the missing
subscript in the left-hand members of (7).

One may inquire whether the denominators in (6) may not vanish under 
certain conditions. That this is impossible may be seen by considering the 
several possible cases. 

It should be stated first that in the five-body problem the masses are 
all different from zero. The form of equations (1) shows that this hypothesis 
has been made. Suppose, now, that no three of the bodies are in the same
straight line (so that no $D$ is zero), and let one of the expressions in $S$, say
$S_{24}S_{35} - S_{25}S_{34}$, vanish. It follows at once from (6) that


That is, if one denominator in (6) vanishes, all must vanish. But it is
impossible to satisfy all of the equations (A) simultaneously, for if it were, then
equations (5) would show that a permanent configuration of five bodies is
obtained in which the five masses are arbitrary. Let two of these masses
approach zero. The result, in so far as finite masses are concerned, is a
permanent configuration of three bodies with arbitrary masses. The only such figure
is an equilateral triangle. Hence, if the equations (A) are satisfied by a
pentagon configuration, any three vertices of this pentagon must form an
equilateral triangle. This is geometrically impossible.

Consider next the case where three or more of the bodies are in the same
straight line. The case in which they are all in a line has been treated by 
Moulton, and the method of this paper is not applicable. Outside of Moulton’s 
case there are only two five-body configurations in which some of the denomi- 
nators in (6) vanish; namely, the square and the rhombus with one mass at 
each vertex and the fifth mass at the center. These two cases are treated in 
Examples 2 and 3 on pages 578 and 579, respectively. Hence no omission of 
cases arises through regarding the denominators in (6) as non-zero. 

3. The relations among the triangular area ratios. We assume that in general
no three of the points $(x_i, y_i)$, ($i = 1, 2, \cdots, 5$), lie in the same straight
line. Hence they uniquely determine a conic, and this conic may be projected
into a circle. In this process of projection the triangular area ratios remain
unchanged; that is, the $\lambda$'s defined in (8) are invariant under projection. If
we make use of this property, we find that when the $D$'s are eliminated from
(8), the $\lambda$'s satisfy a system of equations of the form


(9) Ai — Aste — = 0, 
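
The invariance of a triangular area ratio under projection can be tested exactly on an example. The Python sketch below takes the ratio $D_{12}D_{34}/D_{14}D_{23}$, in which each $D$ is twice the signed area of the triangle on the three points whose indices are missing from its subscripts; the orientation convention (vertices taken in increasing index order), the sample pentagon, and the matrix `H` of the projective map are all assumptions of the example.

```python
from fractions import Fraction

def area2(p, q, r):
    """Twice the signed area of the triangle p, q, r."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def lambda_5(points):
    """The ratio D12*D34 / (D14*D23) for five points numbered 1..5."""
    p1, p2, p3, p4, p5 = points
    d12 = area2(p3, p4, p5)   # subscripts 1, 2 missing
    d34 = area2(p1, p2, p5)   # subscripts 3, 4 missing
    d14 = area2(p2, p3, p5)   # subscripts 1, 4 missing
    d23 = area2(p1, p4, p5)   # subscripts 2, 3 missing
    return Fraction(d12 * d34) / Fraction(d14 * d23)

def project(h, p):
    """Apply the projective map with 3x3 matrix h to the point p."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (Fraction(h[0][0] * x + h[0][1] * y + h[0][2], w),
            Fraction(h[1][0] * x + h[1][1] * y + h[1][2], w))

pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
H = [[2, 1, 3], [-1, 3, 1], [1, 2, 10]]   # nonsingular, chosen arbitrarily
before = lambda_5(pentagon)
after = lambda_5([project(H, p) for p in pentagon])
```

Each point enters the numerator and the denominator the same number of times, so the scale factors introduced by the projection cancel and `before` and `after` coincide.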


4. A geometric property of the $\lambda$'s. Let us consider, as typical of the
triangular area ratios, the first one, which by (8) is

$\lambda_5 = D_{12}D_{34}/D_{14}D_{23}.$

If the points 1, 2, 3, and 4 are assigned coordinates and $\lambda_5$ is assigned a value,
then this equation represents a conic whose variables are the coordinates of
the point 5, the nature of the conic depending upon the value given to $\lambda_5$.
The conic obviously passes through the points 1, 2, 3, and 4. It will be useful 
later to know under what condition the conic will degenerate into a pair of 
straight lines. 

Let the points 1, 2, 3, and 4 have, respectively, the coordinates $(0, 0)$,
$(x_2, 0)$, $(x_3, y_3)$, and $(x_4, y_4)$, let the coordinates of the point 5 be $(x, y)$, and
omit the subscript on $\lambda_5$. Then if the equation of the conic is written out and
the condition for a conic to degenerate into a pair of straight lines is imposed,
it is found that

$\lambda(\lambda + 1) = 0.$
Hence we have the following theorem: 


THEOREM 4.1. If, in any one of the triangular area ratios, the point
indicated by the subscript on the associated $\lambda$ is allowed to vary while the other four
points are fixed, the variable point will move in a conic which passes through the
four fixed points and will degenerate into a pair of straight lines if, and only if,
$\lambda$ has either the value 0 or $-1$.
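
The degeneracy condition of Theorem 4.1 can be confirmed on a numerical example. The Python sketch below writes each $D$ as a linear form in the coordinates $(x, y)$ of the point 5, builds the symmetric matrix of the conic $D_{12}D_{34} - \lambda D_{14}D_{23} = 0$, and evaluates its determinant, which vanishes exactly when the conic is a pair of lines. The orientation convention for the signed areas (vertices in increasing index order), the four sample points, and the function names are assumptions of the example.

```python
from fractions import Fraction

def area_form(p, q):
    """Coefficients (A, B, C) such that A*x + B*y + C is twice the signed
    area of the triangle p, q, (x, y), taken in that order."""
    (x1, y1), (x2, y2) = p, q
    return (Fraction(y1 - y2), Fraction(x2 - x1), Fraction(x1 * y2 - x2 * y1))

def product_matrix(l, m):
    """Symmetric 3x3 matrix of the quadratic form l(x, y, 1) * m(x, y, 1)."""
    return [[(l[i] * m[j] + l[j] * m[i]) / 2 for j in range(3)]
            for i in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def conic_discriminant(p1, p2, p3, p4, lam):
    """Determinant of the conic D12*D34 - lam*D14*D23 = 0 traced by point 5."""
    d12 = area_form(p3, p4)
    d34 = area_form(p1, p2)
    d14 = area_form(p2, p3)
    d23 = area_form(p1, p4)
    left = product_matrix(d12, d34)
    right = product_matrix(d14, d23)
    lam = Fraction(lam)
    return det3([[left[i][j] - lam * right[i][j] for j in range(3)]
                 for i in range(3)])

# Four points, no three collinear.
points = [(0, 0), (4, 0), (5, 3), (2, 5)]
at_zero = conic_discriminant(*points, 0)
at_minus_one = conic_discriminant(*points, -1)
at_one = conic_discriminant(*points, 1)
at_half = conic_discriminant(*points, Fraction(1, 2))
```

With this convention the determinant vanishes at $\lambda = 0$, where the conic is the line pair 12, 34, and at $\lambda = -1$, where it is the line pair 13, 24, and it is different from zero at the other sampled values.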


é=1,2,---,5. 
5. The necessary condition. In order that the problem may admit a solu-
tion other than the straight line solution, it is necessary that equations (7) 
be satisfied. However, if any three of these equations are satisfied, the re- 
maining two are also satisfied, since any set of three of them implies the other 
two. For instance, suppose we write down the following three of these equa- 
tions: 

SigSe3 — SisSea(1 + As) + = 0 
(10) SisSe4 — S14S25(1 + As) + = 0, 


So5S34 S2aS35(1 + + = 0, 


which correspond, respectively, to the values 1, 4, and 2 of $i$, and write down
the one corresponding to $i = 5$, namely,


(10a) S12S35 — + As) + = 0. 


To show that this equation is implied by equations (10) multiply the second 
equation of (10) by 1523 and the third one by A;Siz and subtract, thus elimi- 
nating S45. Multiply this result throughout by A; and the first equation of 
(10) by As. Then, using the relation \3 =A1A5+A1\sA5 from (9) and subtracting, 
we have 


(11) SisS25(1 + As) — Si2S35(1 + Ar)As — SisSe3(1 — Ards) = 0. 


In order to see that this equation is identical with (10a), one only has to 
eliminate \, from the second and fourth of the equations (9) obtaining 
4=(1—AiAs)/(1+A1)As, and substitute this expression in (10a). 

6. Convex and concave pentagons. Suppose that pegs are placed at each 
of the five masses and a string is drawn tautly around them. Two cases are 
distinguishable: 

Case 1. The string touches all five pegs, and 

(a) no three pegs are in a straight line; 

(b) three pegs are in a straight line, but not four; 
(c) four pegs are in a straight line, but not five; 
(d) five pegs are in a straight line. 

Case 2 (a). The string touches only four pegs, the fifth one being inside 
the quadrilateral formed by the other four. 

Case 2 (b). The string touches only three pegs, the other two being inside 
the triangle formed by the first three. 

If the conditions of Case 1 (a), 1 (b), or 1 (c) are satisfied, the pentagon 
will be called convex. Case 1 (d) is the straight line configuration with which 
we are not concerned. If the conditions of Case 2 (a) or Case 2 (b) are satis- 
fied, the pentagon will be called concave. 
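
The peg-and-string classification above amounts to counting how many of the five points lie on the boundary of their convex hull. A minimal Python sketch follows; it assumes exact (integer or rational) coordinates so that collinearity can be tested with equality, and its function names and returned labels are hypothetical.

```python
from itertools import combinations

def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Corner points of the convex hull (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def on_segment(p, a, b):
    return (cross(a, b, p) == 0
            and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))

def pegs_touched(points):
    """Number of pegs the taut string touches (points on the hull boundary)."""
    hull = convex_hull(points)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    return sum(any(on_segment(p, a, b) for a, b in edges) for p in points)

def max_collinear(points):
    """Largest number of the given points lying on one straight line."""
    best = 2
    for k in range(3, len(points) + 1):
        for subset in combinations(points, k):
            if all(cross(subset[0], subset[1], q) == 0 for q in subset[2:]):
                best = k
    return best

def classify(points):
    touched = pegs_touched(points)
    if touched == 5:
        return {2: "1(a)", 3: "1(b)", 4: "1(c)", 5: "1(d)"}[max_collinear(points)]
    return "2(a)" if touched == 4 else "2(b)"

convex_case = classify([(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)])
centered_square = classify([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
```

Cases 1 (a), 1 (b), and 1 (c) are the convex pentagons of the text, Case 1 (d) is the straight line configuration, and Cases 2 (a) and 2 (b) are the concave pentagons.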


In all cases we shall regard the masses as arranged in the counter-clockwise
order $m_1$, $m_2$, $m_3$, $m_4$, and $m_5$.
From equations (6) we may write 
Diz (Res — Ro)(R3s — Ro) — (Ra — Ro)(Res — Ro) 
Das (Ru — Ro)(Ras = Ro) — (Rea — Ro)(Ris — Ro) 
(Ras = Ro) (Ris Ro) (Ris Ro) (Rss Ro) 
Di (Ru — — — Ru — 
Dos (Ris Ro) (Ras Ro) (Ris Ro) (Rss Ro) 
Du (Ru — — — (Ru — 
Dox (Riz — Ro)(Rss — Ro) — (Ris — Ro)(Res — Ro) 
De Gu — — 8) - Ga 
Das (Ris — Ro)(Res — Ro) — (Res — Ro)(Ris — Ro) 
Das (Res — Ro)(Ris — Ro; — (Ris — Ro)(Ros — Ro) 
_ Dis (Ris — Ro)(Res — Ro) — (Ris — Ro)(Roa — Ro) 
Du (Res — — Rd) — — 
Dis (Ros Ro) (Ras Ro) (Rss Ro) (Res Ro) 
Das (Riz — Ro)(Ras — Ro) — (Ris — Ro)(Res — Ro) 
(Ris Ro) (Ros Ro) (Rie Ro) (Rss Ro) 


my, 3» 


3» 


3) 


3) 


5» 


5» 


Dis (Riz — Ro)(Rss — Ro) — (Ris — Ro)(Res — Ro) 
Du (Res — Ro)(Ras — Ro) — (Res — Ro)(Ras — Ro) 
Des (Ris — Ro)(Rea — Ro) — (Ria — Ro)( Res — Ro) 
Dye (Ris = Ro) Ro) (Ris Ro) (Ras Ro) 
Du (Ris — Ro)(Res — Ro) — (Ris — Ro)(Res — Ro) 


5» 


me m5. 
For a convex pentagon, the ten triangular area ratios in these equations 
satisfy the inequalities 


Dy2/De3 > 9, > 0, > 0, Deos/Do3 > 0, 
(13) D45/D35 > 0, Das/Dsa > 9, Dis/Dis > 0, Dus/Dis > 0, 
> 0, Di2/Dis > 0. 


For the general concave pentagon of the type described in Case 2 (a), there 
are four different positions that the fifth mass may occupy within the quad- 
rilateral formed by the other four masses. These positions are shown in Figs. 
la, 1b, 1c, and 1d. The following table gives the signs of the triangular area 
ratios for each of these four cases: 
Da | Da | Du | Du | Du | Du | Du | Du | De | Du
Figure | | Dis | Du | Des | Das | | Das | Dis | Dos | Dis 
la _ + + + + + + + _ —_ 
1b - + + + - 
1c - + + + - 
1d + + 
FIG. 1a    FIG. 1b    FIG. 1c    FIG. 1d


7. Admissible convex pentagons. An admissible pentagon is one for which 
there exist positive masses which, when placed at its vertices and started off 
suitably, will give a permanent configuration. 

In any solution of the problem, one sees from (10) that the following 
equations 
(Ris Ro) (Res and Ro) (Ris Ro) (Ras Ro) 
(Ris — Ro)(Res — Ro) — (Ris — Ro)(Ras — Ro) DisDas 
(Rea Ro) (Ris Ro) (Ris Ro) (Res Ro) Dy2D 5 
(Ris — Ro)(Res — Ro) — (Riz — Ro)(Ras — Ro) 
(Rss — Ro)(Res — Ro) — (Res — Ro)(Ras — Ro) DasDas 
(Res — Ro)(Rss — Ro) — (Res — Ro)(Ris — Ro) DsaDas 


(14) 


must be satisfied, and in addition, for any solution, equations (12) must yield 
positive masses. 

Since for convex pentagons the inequalities (13) must hold, and since the 
masses are necessarily positive, it follows that the numerator and denomina- 
tor involving the R’s in each of the equations (12) must be of like sign. If we 
let the numerator and denominator of the first equation be positive, the signs 
of the numerator and denominator in each of the other equations are deter- 
mined, and we find that 


$(R_{i,i+1} - R_0)(R_{i+2,i+3} - R_0) > (R_{i,i+2} - R_0)(R_{i+1,i+3} - R_0) > (R_{i,i+3} - R_0)(R_{i+1,i+2} - R_0), \qquad i = 1, \cdots, 5.$
We can choose three of the points arbitrarily. Let these be the first three.
We then have given $r_{12}$, $r_{23}$, and $r_{13}$ and can therefore choose their order
relation. Let this relation be


(15) 


S ro S ris. 


Draw $r_{12}$ as in Fig. 2, and let $m_1$ and $m_2$ be at its extremities. With $m_1$ and $m_2$
as centers draw semicircles with radius $r_0$, intersecting at $O$. With $O$ as center
describe circular arc $a_1b_1c_4b_2a_2$ with radius $r_0$. Then with $m_3$ in the region $Oa_2b_2$,
$m_4$ in the region bounded by circular arcs $Ob_1$, $Ob_2$, and $b_1c_4b_2$, and $m_5$ in the
region $Oa_1b_1$, the inequalities (15) are all satisfied, subject to the condition


(16) $r_{23}, r_{34}, r_{45}, r_{15} \le r_0 \le r_{13}, r_{14}, r_{24}, r_{25}, r_{35}$


which assures that the masses in the regions specified are all positive. The 
proof that with the masses in these regions equations (14) are satisfied is too 
long to write down here. Only the method of procedure will be indicated. 
In Fig. 2 take $m_3$ in its region as shown or on the boundary of this region,
and with this point as center and with $r_0$ as radius describe circular arc $c_1c_2c_3c_4$,
intersecting arc $Oa_1$ at $c_2$, arc $Ob_1$ at $c_3$, and arc $b_1b_2$ at $c_4$. On account of the
inequalities (16), the mass $m_4$ is further restricted to the region $Oc_3c_4b_2$. Let
this region be covered with an infinity of arcs, all passing through the point $O$
and intersecting the boundary arc $c_3c_4$. It can be shown that on each of these
arcs there is one and but one point at which the first of equations (14) is
satisfied. If these points are joined by a curve, we have the curve on which
point $m_4$ moves as $m_3$ varies its position in the region $Oa_2b_2$. Imagine $m_4$ on
this curve, and with it as center and $r_0$ as radius describe circular arc $a_3b_3$
intersecting $Ob_1$ at $b_3$ and $Oa_1$ at some point $a_3$ below $c_2$. The remaining two
equations of (14), then, determine two curves which pass through m, and in- 
tersect in the region ¢2c3b3a3, giving the position of ms. As the point m, varies 
its position on its curve, $m_5$ moves on some curve in the region $c_2c_3b_3a_3$. These
results lead to the following theorem:


THEOREM 7.1. For every point $m_3$ in the region $Ob_2a_2$, or on the boundary of
this region, there exists a single infinity of points $m_4$ in the region $Oc_3c_4b_2$ and a
single infinity of points $m_5$ in the region $Oa_1b_1$, which together with the points $m_1$
and $m_2$ form an admissible convex pentagon.

8. Limitations on the interior angles. In Fig. 2 let the angles of the
pentagon at its five vertices be denoted by the masses at these points. For
example, angle $m_2m_1m_5$ will be denoted by $m_1$ and angle $m_1m_2m_3$ by $m_2$.

It is obvious that angle $m_1$ is equal to or greater than angle $m_2m_1O$ and
that this latter angle obtains its least value when $r_{12} = r_0$. This value is seen
to be 60° from Fig. 3, which is the same as Fig. 2 except that it is drawn with
$r_0 = r_{12}$. Therefore angle $m_1$ is equal to or greater than 60°, and the same is
clearly true for angle $m_2$.

Since $r_{12} = r_0$, the first of equations (14) shows that the point $m_4$ moves on
some curve through the point $O$, such as curve $n$, Fig. 3. We see, however,
that the smallest value of angle $m_3$ is obtained when point $m_3$ is at $b_2$ and point
$m_4$ is at $O$, in which case it is 60°. By symmetry it follows that the smallest
value of angle $m_5$ is also 60° and is obtained when point $m_4$ is at $O$ and point $m_5$
is at $b_1$.


Consider now angle $m_4$. In Fig. 3, let points $m_3$ and $m_5$ be at $m_3'$ and $m_5'$,
respectively, so that $r_{35}$ equals $r_0$, its least value. With these two points as
centers describe arcs $g$ and $s$, respectively, each with radius $r_0 = r_{12} = r_{35}$. From
the inequality (16) we see that the point $m_4$ cannot lie outside of the region
bounded by the arcs $g$, $s$ and the arcs of the circles drawn with the points $m_1$
and $m_2$ as centers. It is clear then that angle $m_4$ is smallest when point $m_4$ lies
at the intersection of arcs $g$ and $s$, and that it is then 60°.

Hence no interior angle of the convex pentagon can be less than 60°. 

The maximum value of the angle $m_1$ is also reached when $r_{12} = r_0$, as can be seen from Figs. 2 and 3, and is that angle $\theta$, Fig. 3, which the circle with O as center makes with the line $r_{12}$.

In order to find $\theta$ let the line through the points $m_1$ and $m_2$ be the x axis, and let these two points be symmetric with respect to the origin, so that the point O will be on the y axis. Then if the equation of the circle with O as center and $r_0$ as radius is formed, it follows easily that the angle $\theta$ which it makes with the line $r_{12}$ is 150°.
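
The value of 150° can be reproduced with the coordinates just described. This is a sketch under the stated setup ($r_{12} = r_0$, the two points symmetric about the origin on the x axis); the variable names are illustrative only.

```python
import math

r0 = 1.0
# m1 and m2 on the x axis, symmetric about the origin, with r12 = r0.
m1, m2 = (-r0 / 2, 0.0), (r0 / 2, 0.0)
# O is at distance r0 from both points, so it lies on the y axis.
O = (0.0, math.sqrt(r0**2 - (r0 / 2) ** 2))

side = (m2[0] - m1[0], m2[1] - m1[1])   # direction of the line r12 at m1
radius = (m1[0] - O[0], m1[1] - O[1])   # vector from O to m1
tangent = (radius[1], -radius[0])       # radius turned by -90 degrees; points into y > 0

dot = side[0] * tangent[0] + side[1] * tangent[1]
cross = side[0] * tangent[1] - side[1] * tangent[0]
theta_deg = math.degrees(math.atan2(cross, dot))
```

The radius $Om_1$ makes 60° with the line $r_{12}$, and the tangent is perpendicular to it, which gives the 150° measured on the side of the line containing O.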


It is obvious from Fig. 3 that angle $m_2$ also has this angle for its maximum value; and this is true for angles $m_3$, $m_4$, and $m_5$ as well, as can be shown by going through the same procedure for these angles as for angle $m_1$.


THEOREM 8.1. Each of the interior angles of an admissible convex pentagon 
lies between 60° and 150°. 


9. Admissible concave pentagons. We shall consider the general case where one of the points $m_i$ is within the quadrilateral formed by the other four. Let the vertices of the quadrilateral be $m_1$, $m_2$, $m_3$, and $m_5$, and let the interior point be $m_4$, whose position is that shown in Fig. 1d.

For concave pentagons of the type under consideration, the signs indi- 
cated in the last row in the table on page 569 must obtain, and since the 
masses are positive, the R’s in equations (12) must satisfy the relations 


(Riz — Ro)(Ras — Ro) > (Ris — Ro)(Ros — Ro) < (Res — Ro)(Ris — Ro), 
(Raa — Ro)(Ris — Ro) > (Ris — Ro)(Rss — Ro) > (Ris — Ro)(Ras — Ro), 
(17) (Rss — Ro)(Res — Ro) < (Res — Ro)(Ras — Ro) > (Res — Ro)(Ras — Ro), 
(Res — Ro)(Ris — Ro) < (Ris — Ro)(Res — Ro) > (Riz — Ro)(Rss — Ro), 
(Ris — Ro)(Res — Ro) > (Ris — Ro)(Res — Ro) < (Riz — Ro)(Rs4 — Ro)- 


If $m_4$ is in the position shown in either Fig. 1a or Fig. 1b, then the first inequality sign in the second inequality and the second inequality sign in the third inequality must be reversed, while there is no change if $m_4$ is in the position shown in Fig. 1c.

Here, as in the convex case, we may select three of the five points of the pentagon arbitrarily, and we shall take as these three the points $m_1$, $m_2$, and $m_3$. Also let the order relation between $r_{12}$, $r_{23}$, and $r_0$ be $r_{12}, r_{23} \le r_0$.

In Fig. 4 draw $r_{12}$ with the masses $m_1$ and $m_2$ at its extremities. With these
two points as centers draw semicircles with radius $r_0$ intersecting at O. With O
as center describe circular arcs a,b, and debe, each with radius ro. On account 
of the inequalities (17), m; will lie in the region Obsa2, ms in Od,d2, and ms in 
Oa,b,, with the additional condition 


$$r_{12},\ r_{23},\ r_{34},\ r_{14},\ r_{24},\ r_{15},\ r_{35},\ r_{45} \le r_0 \le r_{13},\ r_{25}.$$


With the masses in these regions it may be shown that equations (14) are 
satisfied. Hence we have the companion theorem: 


THEOREM 9.1. For every point ms; in the region Obsdz, or on the boundary 
of this region, there is a single infinity of points m, in the region Od,d2 and a 
single infinity of points ms in the region Oa,b, which, together with the points m, 
and m2, form an admissible concave pentagon. 

10. Symmetric pentagons. Suppose that
$$r_{23} = r_{15}, \quad r_{14} = r_{24}, \quad r_{34} = r_{45}, \quad r_{13} = r_{25},$$
and consequently that
Do = Dis, Dis Du Das, Des = Djs. 
The configurations of Fig. 5 then present themselves in the form of isosceles trapezoids with vertices at the points $m_1$, $m_2$, $m_3$, and $m_5$ and with the point $m_4$ somewhere on the line perpendicular to and bisecting the parallel sides $r_{12}$ and $r_{35}$. The construction of the figure is the same as that of Fig. 2.
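
As an illustration of the symmetry just described, the sketch below places the trapezoid and the fifth point in hypothetical coordinates (the numbers `a`, `b`, `h`, `k` are arbitrary and not taken from the paper) and checks that the mutual distances pair off as $r_{23} = r_{15}$, $r_{14} = r_{24}$, $r_{34} = r_{45}$, $r_{13} = r_{25}$.

```python
import math

# Hypothetical coordinates: the axis of symmetry is the y axis.
a, b, h, k = 1.0, 1.6, 1.5, 2.4
m1, m2 = (-a, 0.0), (a, 0.0)   # one parallel side, r12
m5, m3 = (-b, h), (b, h)       # the other parallel side, r35
m4 = (0.0, k)                  # fifth point on the perpendicular bisector

r = math.dist
pairs = {
    "r23 = r15": (r(m2, m3), r(m1, m5)),
    "r14 = r24": (r(m1, m4), r(m2, m4)),
    "r34 = r45": (r(m3, m4), r(m4, m5)),
    "r13 = r25": (r(m1, m3), r(m2, m5)),
}
all_equal = all(math.isclose(x, y) for x, y in pairs.values())
```

Any other choice of the four numbers gives the same pairing, since the equalities depend only on the reflection in the y axis.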


Fig. 4 Fig. 5


Equations (14) reduce to the two independent equations 


(Res — Ra)(Res — Ru) 
(Ris — Ro)(Res — Ro) — (Riz — Ro)(Rus — Ro) DasDig 
(Ris Ro) (Ra Ro) (Res Ro)(R3s Ro) Dis 
(Res — Ro)(Ras — Ro) — (Res — Ro)(Ras — Ro) Das 


(18.1) 


(18.2) 


and the mass equations (12) become 


©. (Res — Ro)(Ris — Res) 
m, m2 Diz (Res — Ro)(Rss — Ro) — (Ris — Ro)(Ras — Ro) 

ns 2.2.2 (Res — Ro)(Ris — Res) 
m2 Diz (Res — Ro)(Rsa — Ro) — (Rea — Ro)(Rss — Ro) 

(19.3) = _ Dis (Ris ~ Ro)(Res — Ro) — (Ris — Ro)(Rus — Ro) 


ms Dos (Ris Ro) (Ris Ro) = (Rie Ro) (Rss Ro) 


ms = M3. 


From either (19.1) or (19.2) it follows that we have also $m_1 = m_2$.


For convex pentagons all of the triangular area ratios in these equations
are positive. For concave pentagons D,;/D1. and Des/D1. are negative, if the 
point m, is above the diagonal 7;; otherwise they are positive. Moreover 

The convex case. In Fig. 5, $r_{12}$ is regarded as fixed with the masses $m_1$ and $m_2$ at its extremities, while the point $m_3$ (and its symmetric point $m_5$) varies when the point $m_4$ moves from O towards $c_2$, as $m_4$ must do since $r_{24} \ge r_0$ for convex pentagons.

When the point $m_4$ is at O, equations (18.1) and (18.2) become, respectively,

(Rie Ro) (Rs Ro) Di2D 45 0, 


(R34 — Ro) [(Ris — Ro)Dos + (Res — Ro) = 0. 


In the first equation since neither (Ris— Ro), Dw, nor D4; is zero, we must 
have (R3,— Ro) =0; whence r34=7o. It then follows that point ms is somewhere 
on the circular arc m,b,c2b.a2me, and it can be proved that it is at some point dz 
between a2 and b, with m; at the symmetric position d,. 

As the point $m_4$ moves from O towards $c_2$, point $m_3$ moves on some curve $n$ from $d_2$ towards O, and in like manner, point $m_5$ moves from $d_1$ towards O along some curve $n'$ which is symmetric to $n$. When $m_4$ reaches a position on $Oc_2$ such that the points $m_3$ and $m_5$ have moved in towards each other on curves $n$ and $n'$, respectively, so that $r_{35} = r_0$, then the limit for permanent convex configurations is attained, since by inequalities (16), $r_{35} \ge r_0$.

The concave case. As the point $m_4$ moves from O towards $c_1$, as it must do for concave pentagons since $r_{24} \le r_0$, point $m_3$ moves on some curve $s$ from $d_2$ towards O, and in like manner, point $m_5$ moves on some curve $s'$, symmetric to $s$, from $d_1$ towards O. However, since for concave pentagons $r_{35} < r_0$, these curves are of no interest until points $m_3$ and $m_5$ reach positions on curves $s$ and $s'$, respectively, where $r_{35} = r_0$.

Let the corresponding position of m, be O’. If one starts with the point m4 
in this position, as it moves on towards c; points m; and m, move towards O. 
When they reach this position one has Diz=0, D1s/D25= —1. 
Equations (18.1) and (18.2) are then both satisfied, regardless of the position 
of point m, on Oc. 

Thus, for a given $r_{12}$ and $r_0$, as point $m_4$ moves from O towards $c_2$, point $m_3$ moves from $d_2$ towards O along $n$, and point $m_5$ moves from $d_1$ towards O along $n'$, giving, in all positions for which $r_{35} \ge r_0$, a convex pentagon; while as point $m_4$ moves from $O'$ towards $c_1$, points $m_3$ and $m_5$ move towards O along $s$ and $s'$, respectively, giving, in all positions where $r_{35} < r_0$, a concave pentagon. Since the mass ratios are all positive for all of these pentagons, as can be seen from equations (19.1), (19.2), and (19.3), we can state the following theorem:

THEOREM 10.1. For every $r_{12}$ and $r_0$ there exists a single infinity of permanent configurations with one mass at each of the vertices of an isosceles trapezoid and with the fifth on the line perpendicular to and bisecting the two parallel sides.


As the point $m_4$ moves from O towards $c_2$, Fig. 5, and consequently, points $m_3$ and $m_5$ move from $d_2$ and $d_1$ towards the symmetric points on $n$ and $n'$ where $(r_{35} - r_0)$ vanishes, one sees from equations (19.1), (19.2), and (19.3) that the mass ratios


change steadily from the values 
Dis(Ris R23) Dos(Ris R23) 
Rss + Ris 2Ro) Dy42(2Ro Ro R35) 


0, 


respectively, to 
Ro) (Ris R23) Deos(Ros <> Ro) (Ris R23) 
Dia(Ro — — Ro) — Ro)(Raa — Ro) 
Dis(Ri2 Ro) (Rsa Ro) (Ris Ro) (Rea Ro) 


Des(Rizs — Ro)?* 


where the D's and R's are to be evaluated at the points in question. Similar limits on the mass ratios could be found for the concave case as $m_4$ moves from $O'$ towards $c_1$. We therefore have the following theorem:


THEOREM 10.2. For every $m_1 = m_2 > 0$, $m_3 = m_5 > 0$, and $m_4 > 0$, there is one and only one isosceles trapezoid configuration.

11. The case of three points on a line. Let $m_4$ be on the line $r_{13}$ joining $m_1$ and $m_3$, and let $m_2$ and $m_5$ be equidistant from $r_{13}$. We have at once
(20) $r_{12} = r_{15}$, $r_{23} = r_{35}$, $r_{24} = r_{45}$,
Des = 0, D35 = Dos, Dis = Das, Dis = — Dr, 


and the equations (14) reduce to the single equation 
(Ris Ro) (Res Ro) (Ris Ro) (Rea Ro) D3sD15 
(Ris Ro) (Rea Ro) (Rie Ro) (Rs Ro) 


(21) 


Let points $m_1$ and $m_3$ be fixed. Then for a given $m_2$, and for any position of point $m_4$, the equation (21), to be satisfied by the coordinates of point $m_5$,


Ms m3 ms ms ms 
my, me me my, ms m4 
is the equation of a straight line. But for a solution the coordinates of $m_5$ must also satisfy the first two equations of (20). (The remaining equations of (20) are then automatically satisfied.) These equations, when rationalized, represent circles with centers at the points $m_1$ and $m_3$ and with radii $r_{12}$ and $r_{23}$, respectively.

In Fig. 6 draw $r_{13}$ with $m_1$ and $m_3$ at its extremities. Take point $m_2$ anywhere not on this line, and draw the circles of the above equations. It is obvious that these two equations are satisfied by real points if, and only if, point $m_5$ lies at the intersection of their circles. That is, point $m_5$ will either be at the point $a$, Fig. 6, or it will coincide with point $m_2$. Thus it may be in one of two positions in so far as these two equations are concerned.
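
The two positions can be exhibited directly. Two circles whose centers lie on the line $r_{13}$ and which both pass through $m_2$ meet again at the mirror image of $m_2$ in that line, so the point $a$ is this reflection. The sketch below uses hypothetical coordinates and a hypothetical helper `reflect`; it only checks the two distance conditions $r_{15} = r_{12}$ and $r_{35} = r_{23}$.

```python
import math

def reflect(p, a, b):
    """Mirror image of point p in the line through points a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    # Parameter of the foot of the perpendicular from p to the line.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    foot = (a[0] + t * dx, a[1] + t * dy)
    return (2 * foot[0] - p[0], 2 * foot[1] - p[1])

# Hypothetical positions: m1 and m3 fixed, m2 off the line joining them.
m1, m3, m2 = (0.0, 0.0), (4.0, 0.0), (1.0, 2.0)
a_point = reflect(m2, m1, m3)

on_first_circle = math.isclose(math.dist(m1, a_point), math.dist(m1, m2))
on_second_circle = math.isclose(math.dist(m3, a_point), math.dist(m3, m2))
```

The only other common point of the two circles is $m_2$ itself, which is the coincident case set aside in the text.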

Equation (21) must be satisfied also. By Theorem 4.1, the left member of 
this equation must be either 0 or —1. If it is 0, the equation reduces to 


Ds, = 0, 


which is the equation of a straight line passing through points m, and mz; 
therefore, in order for the point $m_5$ to satisfy all the equations, it must coincide with $m_2$. This is a case of no interest as $r_{25}$ may vanish for neither convex nor concave pentagons.

Suppose the left-hand member has the value —1. This gives the equation 


Ri Ro Rus Ro 
Ros Ro R34 Ro 


(22) 


The question arises as to whether or not the line with this equation can be made to pass through the point $a$. In other words, are there configurations other than the one where points $m_2$ and $m_5$ coincide? The proof that the answer to this question is in the affirmative is not difficult, but an illustrative example would, perhaps, be more instructive.

Let the point $m_4$ be midway between points $m_1$ and $m_3$. The equation (22) is always satisfied, and if the point $m_2$ is given a position such that the mass ratios are all positive, we have a permanent configuration which, in general, will be a rhombus with vertices at $m_1$, $m_2$, $m_3$, and $m_5$, and with $m_4$ at its center.

In Fig. 7, draw $r_{13}$, and consider the point $m_2$ as it moves on a line perpendicular to $r_{13}$ at its mid-point $m_4$. The positions that it may assume are limited only by the regions for positive masses, since the condition (22) is satisfied for any position of $m_2$ regardless of what $r_0$ may be. Moreover, wherever point $m_2$ may be on this line, we have, from (12),


my me 


m3 ms 




Fig. 6 Fig. 7


With points m, and m; as centers draw circular arcs a; and a; with a radius 
ro, (ro<mriz), such that the distance between their intersections 2 and #3 is 
equal to ro. Since 725270, we have, with point m2 at p3 and ms; at po, the 
rhombus configuration in which the point mz is as close to m; as possible for 
a given 73. In this case the mass equations (12) show that 


m3 (Ris Ro) 


> 0, 


The greatest possible distance between the points mz and ms, for a given 
113, is obtained when we choose an 79=713. Draw circular arcs }; and b, with 
such a radius. Then, with m, and p, and ms; at p:, we have the rhombus con- 
figuration in which the distance between points m2 and ms is as great as possi- 


ble. Here, we have 


(24) = = 
me (Ros — Ro) 

THEOREM 11.1. Given $r_{13}$ with masses $m_1$ and $m_3$ at its extremities and with the mass $m_4$ at its center, for a properly chosen $r_0$, and for all positions of $m_2$ and $m_5$ on a line perpendicular to $r_{13}$ at $m_4$ such that the mass ratios have values lying between those given in (23) and those in (24), the configuration is permanent and is a rhombus with vertices at $m_1$, $m_2$, $m_3$, and $m_5$, and with $m_4$ at its center.


In concluding this section we may add that if any four of the masses are 
on the same straight line, the fifth mass will be on this line also. This follows 
readily from equations (14). 

12. The triangle with two interior masses. One other possibility of a con- 
cave pentagon is that of a triangle with one mass at each vertex and with the 
two remaining masses in its interior. That no such configuration, however, 
can form an admissible pentagon follows from the inequalities 


$$r_{12},\ r_{23},\ r_{34},\ r_{14},\ r_{24},\ r_{15},\ r_{35},\ r_{45} \le r_0 \le r_{13},\ r_{25}$$

(see §9). To construct a pentagon of this type one must alter the above order relation that must exist among the sides and diagonals, and the result is always to cause some of the masses to be negative.

13. Numerical solutions. We consider the following examples: 

EXAMPLE 1. An interesting example of a convex permanent configuration is that of the regular pentagon. In Fig. 5 as the point $m_4$ moves upward from O towards $c_2$ and the points $m_3$ and $m_5$ move towards each other along curves $n$ and $n'$, somewhere along these paths the configuration becomes that of a regular pentagon. When this position is reached, the equations (18.1) and (18.2) reduce to the single equation


Ros — Ro Dis 12 


(25) @ 
2Ro — Rie — Des 
or 
3 3 3 2 3 
(26) (ro 12% 24 — 2rieres + 70 (res + ri2) = 0. 


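For any given value of $r_{12}$, $r_{24}$ may be computed from the obvious relation

$$r_{24} = 1.6180\, r_{12},$$

and $r_0$ may be found from (26). It follows readily that

$$\frac{r_0}{r_{12}} = 1.3076, \qquad \frac{r_{23}}{r_{12}} = \frac{r_{34}}{r_{12}} = \frac{r_{45}}{r_{12}} = \frac{r_{15}}{r_{12}} = 1,$$

$$\frac{r_{13}}{r_{12}} = \frac{r_{24}}{r_{12}} = \frac{r_{14}}{r_{12}} = \frac{r_{25}}{r_{12}} = \frac{r_{35}}{r_{12}} = 1.6180,$$

and from (19.1), (19.2), (19.3), together with (25), that $m_1 = m_3 = m_4 = m_5$.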
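
The two numerical ratios of this example can be checked independently of (26). The sketch below is not the paper's computation: it assumes five equal unit masses under Newtonian attraction with unit gravitational constant, finds the angular rate $\omega$ for which the regular pentagon rotates rigidly, and then takes $R_0 = r_0^{-3} = \omega^2 / M$ with $M$ the total mass. That normalization of $R_0$ is an assumption, adopted here because it reproduces the quoted value of $r_0 / r_{12}$.

```python
import math

n = 5
side = 1.0                                  # r12
rho = side / (2 * math.sin(math.pi / n))    # circumradius of the pentagon
pts = [(rho * math.cos(2 * math.pi * k / n), rho * math.sin(2 * math.pi * k / n))
       for k in range(n)]

# Acceleration of the vertex at (rho, 0) due to the other four unit masses.
x0, y0 = pts[0]
ax = ay = 0.0
for (x, y) in pts[1:]:
    d = math.dist((x0, y0), (x, y))
    ax += (x - x0) / d**3
    ay += (y - y0) / d**3

omega_sq = -ax / x0          # rigid rotation requires acceleration = -omega^2 * position
R0 = omega_sq / n            # assumed normalization: R0 = omega^2 / (total mass)
r0 = R0 ** (-1.0 / 3.0)

ratio_r0 = r0 / side                             # compare with 1.3076
ratio_diag = math.dist(pts[0], pts[2]) / side    # compare with 1.6180
```

The component `ay` vanishes by symmetry, which confirms that the attraction on each vertex is directed towards the center.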

EXAMPLE 2. Perhaps the simplest concave permanent configuration in the five-body problem is that of a square with one mass at each of its vertices and with the fifth mass at its center. In Fig. 5 this particular configuration is passed through as the point $m_4$ moves from O towards $c_1$ and as points $m_3$ and $m_5$ move towards each other along curves $s$ and $s'$. Equations (18.1) and (18.2) are now identically satisfied for any value of $r_0$, while the mass


equation (19.1) shows that 


my, me 


and since for all of the symmetric configurations, Fig. 5, $m_3 = m_5$, $m_1 = m_2$, we have $m_1 = m_2 = m_3 = m_5$.


1938] THE PROBLEM OF FIVE BODIES 


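The equation (19.3) becomes

$$\frac{m_3}{m_4} = \frac{m_5}{m_4} = \frac{1}{2}\,\frac{R_{34} - R_0}{2R_0 - R_{12} - R_{13}} = \frac{(r_0^3 - r_{34}^3)\, r_{12}^3 r_{13}^3}{2 r_{34}^3 \left(2 r_{12}^3 r_{13}^3 - r_0^3 r_{13}^3 - r_0^3 r_{12}^3\right)} = \frac{4 r_0^3 - 2^{1/2} r_{12}^3}{2^{5/2} r_{12}^3 - \left(1 + 2^{3/2}\right) r_0^3}, \tag{27}$$

since $r_{13}^2 = 2r_{12}^2 = 4r_{14}^2$.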

Thus, while equations (18.1) and (18.2) are satisfied for any $r_0$, the mass ratios $m_3/m_4 = m_5/m_4$ are not necessarily positive for every value of $r_0$. It is found that for the last right-hand member of (27) to be positive we must have

$$0.878 < \frac{r_{12}}{r_0} < 1.000.$$


It then follows that 
M3 Ms 
0.439 < — = — = 1.414. 
This result shows that the central mass is arbitrary within certain limits. In 
general, if a permanent configuration is given, the masses are uniquely deter- 
mined. 
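
Two of the numbers quoted in this example can be checked with a short computation. The sketch below rests on assumptions that should be read as such: it takes $R_{ij} = r_{ij}^{-3}$ and $R_0 = r_0^{-3}$, and it takes the mass ratio in the form $m_3/m_4 = \tfrac{1}{2}(R_{34} - R_0)/(2R_0 - R_{12} - R_{13})$, as read from the legible parts of (27). The function name is hypothetical.

```python
import math

def ratio_m3_over_m4(r12: float, r0: float) -> float:
    """Assumed form of (27) for the square with side r12 and a central mass."""
    r13 = math.sqrt(2) * r12   # diagonal of the square
    r34 = r13 / 2              # distance from a vertex to the center
    R = lambda r: r ** -3      # assumed relation between R and r
    return 0.5 * (R(r34) - R(r0)) / (2 * R(r0) - R(r12) - R(r13))

# The denominator vanishes when 2*R0 = R12 + R13, that is, when
# (r12/r0)^3 = (1 + 2^(-3/2)) / 2; this is the lower limit of r12/r0.
x_min = ((1 + 2 ** -1.5) / 2) ** (1.0 / 3.0)

# Value of the ratio at the upper limit r12 = r0.
ratio_at_upper_limit = ratio_m3_over_m4(1.0, 1.0)
```

Under these assumptions `x_min` agrees with the quoted 0.878 and `ratio_at_upper_limit` with the quoted 1.414, while the ratio changes sign as $r_{12}/r_0$ passes below `x_min`.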

EXAMPLE 3. In Fig. 7 consider the rhombus configuration (1) with masses $m_1$, $m_2$, $m_3$, and $m_5$ at its vertices and with the mass $m_4$ at its center. By Theorem 11.1, for a given $r_0$ the point $m_2$ may be anywhere on the line perpendicular to $r_{13}$ at its mid-point such that the mass ratios are positive. It is readily found that
Yo ros 35 T1434 


—=—=1, —=—=0.781, 


rie 


r 
1.249, —=1.562; 


my, ms M4 
.617, —=0.536, — 

m3 

As the point mz moves on towards #, and the point ms moves towards fy, 
Fig. 7, the polygon moves through the square configuration (2), but this has 
already been discussed in Example 2 in connection with Fig. 5. 
Rss — Ro 
M4 Ms 
145 125 
—=—=0.625, —= 
and from the mass equations (12) we have, 


